Wednesday, December 31, 2014

Using Krawkraw To Find Broken Links

I recently tweeted that I did just that: used Krawkraw to fish out broken links on the company's website.


Across all 187 pages, I was able to find 8 broken links: the rogues!
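For the curious, checking a single link could look roughly like the sketch below. It is not Krawkraw's API: the class and method names are made up for illustration, and it assumes that any 4xx or 5xx response to a HEAD request (or a failure to connect at all) counts as "broken".

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of Krawkraw.
class BrokenLinkCheck {

    // Assumption: client errors (4xx) and server errors (5xx) mean the link is broken.
    static boolean isBrokenStatus(int status) {
        return status >= 400;
    }

    // Sends a HEAD request and reports whether the link looks broken.
    static boolean isBroken(String link) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(link).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            conn.disconnect();
            return isBrokenStatus(status);
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            // Unreachable host, malformed link, or a non-HTTP scheme: treat as broken.
            return true;
        }
    }

    public static void main(String[] args) {
        for (String link : args) {
            System.out.println(link + " broken? " + isBroken(link));
        }
    }
}
```

Some servers do not answer HEAD requests properly, so a real check may need to fall back to GET.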


Tuesday, December 30, 2014

Blowing a Gasket

It was the first time I ever sought out alcohol to drown out the pain…

It was the period starting somewhere in the middle of 2013 to the early part of 2014. Life happened! Yes, life happened and I found myself in precarious situations I had little or no control over. It came with an unimaginable bout of anxiety and a feeling of helplessness. The anxiety was so much that it seemed to begin to take a crack at my mind...

This was when I was made acutely aware of a distinct part of our well-being as humans...our mental well-being, and that it is as susceptible to malfunctioning, and as much in need of care, as our physical well-being...Not that I have never been through situations that were extremely stressful, I have...and I have my occasional bout of the blues, and fatigue...but none has ever matched the intense craziness of that period…

Things started happening…

It literally felt as if fire had been set to my brain...and then at some other times, it felt as if cold hands made of steel were mercilessly being inserted into my cranium…

Then I became mildly scared of sleeping! Yes! The thought of letting go of my consciousness to be enveloped by the brief oblivion that comes with sleep became such a terrifying thought…What if? What if what? I did not know. It just felt terrifying.

Thinking and concentrating became laborious too; it felt like I needed to apply as much as 10x the effort to perform simple cognitive tasks…remembering stuff also literally hurt...

...and with all this pain going on inside, on the outside, I looked and acted absolutely normal!

And this was the scariest part. The fact that my mind had such capacity to unleash that amount of distress and pain, and yet it all remained invisible...at that point I understood how putting a gun to your temple could feel like the way out to some folks, how swallowing pills or slashing your wrist could feel like the escape yearned for…

I had a brief empathy with certain dark corners of our humanity...I understood.

It was a period I would never want a repeat of. It has taught me the need to be gentle with life, with myself, and to devote as much attention to mental well-being as we do to our physical well-being...we eat well, exercise, avoid toxic habits; so also is the need to avoid ultra-intense situations, to surround oneself with positive thoughts and family, learn not to give a fuck, cast your cares, and even if life happens, learn how to still not give a fuck, or better still, if you can, know how to stick a middle finger back at life...just do about everything you can to shield yourself...

We owe ourselves this.


Friday, December 26, 2014

A web scraper/crawler in Java: Krawkraw

UPDATE: Krawkraw is now referred to as just Webmuncher.

Nobody sets out to write a web scraper, but here I am posting about just that, a general purpose web scraper I wrote…

How did this happen? Well, it started with wanting to play around with ElasticSearch a couple of months back. And in search of some data to index, I looked to websites' contents (with hindsight, a more structured data format should have been sought, but well, the lesson has been learnt)...

So the question then was: how do I easily retrieve all the HTML pages of a site and have them thrown into ElasticSearch? Any easy web scraper in Java out there that I could just use? I can’t even remember actively searching for such a tool...which explains why I somehow decided it would be much more fulfilling if I threw a web scraper together...and in a couple of weekends, Krawkraw came to be.

Krawkraw is a tool that can be used to easily retrieve all the contents of a website. More accurately, the contents under a single domain. This is its perfect use case, which reflects the original need for which it was written. So you can’t start at one edge of the web with Krawkraw and expect to crawl to another edge...Nope.
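To make the "single domain" idea concrete, here is a minimal sketch of how a crawler might decide whether a discovered link is worth following. This is not Krawkraw's actual code: the class and method names are hypothetical, and it assumes "same domain" simply means "same host", so www.example.com and example.com would count as different.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Krawkraw.
class SameDomainFilter {

    // Returns true when the link, resolved against the base URL, has the same host as the base.
    static boolean isSameDomain(String baseUrl, String link) {
        try {
            URI base = new URI(baseUrl);
            // Relative links such as "/contact" become absolute here.
            URI resolved = base.resolve(new URI(link));
            String baseHost = base.getHost();
            String linkHost = resolved.getHost();
            if (baseHost == null || linkHost == null) {
                // e.g. mailto: links have no host, so they are never followed.
                return false;
            }
            return baseHost.equalsIgnoreCase(linkHost);
        } catch (URISyntaxException e) {
            return false; // Malformed links are skipped.
        }
    }

    // Keeps only the links that stay on the base URL's host.
    static List<String> keepSameDomain(String baseUrl, List<String> links) {
        List<String> kept = new ArrayList<>();
        for (String link : links) {
            if (isSameDomain(baseUrl, link)) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> found = List.of(
                "http://example.com/about",
                "/contact",
                "http://other.org/page",
                "mailto:someone@example.com");
        System.out.println(keepSameDomain("http://example.com/", found));
    }
}
```

A filter like this is what keeps a crawl from wandering off to the rest of the web.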

Krawkraw is available via Maven Central, and you can easily drop it into your project with these coordinates:


<dependency>
    <groupId>com.blogspot.geekabyte.krawkraw</groupId>
    <artifactId>krawler</artifactId>
    <version>${krawkraw.version}</version>
</dependency>


Or you can download the jar files here, and have them included in your classpath.

For more information on the API and its usage, the README should be your friend! If you happen to use Krawkraw and find it missing some features, or you have ideas for features it should have, then drop them in the issue tracker.

Wednesday, December 17, 2014

Hello Logging, Hello SLF4J

This is not a full-blown guide, tutorial or introduction to logging in Java with the SLF4J abstraction, but a post that captures some of the nuances and components I personally had to get familiar with when I went from seeing logging as an activity that happens only during development to seeing it as something that is actively built into software so as to have an insight into its state in production.

Personally...In the early days it was var_dump() and alert("It works"). Then I discovered the console…(thanks to Firebug and, much later, Chrome DevTools) and I dropped alert in favor of console.log(). And that was it... This was what "logging" was mainly about: a process of ascertaining the correctness of the software as we develop…

But as soon as I moved into building software systems on the JVM using Java, production logging became more upfront, and so did the need for me, as a software developer, to proactively build logging into my application. It was not a complex switch, but looking back now, I realize that for a while I unintentionally approached production logging as if it were the same as the logging that takes place during development.

They are in no way the same...

First, there are more moving parts. Second, the purpose differs: the ability for an application to write out log messages in production is there to give insight into the internal state of the application and also to provide a good source of forensics in case something goes wrong. It has even been estimated that any trivial system would have about 4% of its code base dedicated to logging. This shows how important production logging should be.
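As a rough illustration of that difference, the sketch below logs at different levels and attaches the exception when something goes wrong, instead of printing "it works". To keep it runnable without extra dependencies it uses the JDK's built-in java.util.logging rather than SLF4J; with SLF4J the logger would come from LoggerFactory.getLogger and the calls would be methods like info, warn and error. The class, method and messages are made up for the example.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class; java.util.logging is used here as a stand-in for SLF4J.
class QuantityParser {

    private static final Logger LOG = Logger.getLogger(QuantityParser.class.getName());

    // Parses a quantity, logging enough context to reconstruct what happened later.
    // Returns -1 when the input is not a valid integer.
    static int parseQuantity(String raw) {
        // Fine-grained detail; usually not shown unless the configured level allows it.
        LOG.log(Level.FINE, "Parsing quantity from input: {0}", raw);
        try {
            int quantity = Integer.parseInt(raw.trim());
            LOG.log(Level.INFO, "Accepted quantity {0}", quantity);
            return quantity;
        } catch (NumberFormatException e) {
            // Passing the exception keeps the stack trace available for forensics.
            LOG.log(Level.WARNING, "Rejected quantity input: " + raw, e);
            return -1;
        }
    }

    public static void main(String[] args) {
        parseQuantity(" 3 ");
        parseQuantity("three");
    }
}
```

The point is less the API and more the habit: each message carries a level and enough context to be useful to someone reading the logs long after the code was written.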

This post thus captures the new eyes with which I had to view production logging compared to the type I had long been familiar with, which is logging during development. It is written with logback in mind, which is an implementation of SLF4J, which has grown to be the standard abstraction when it comes to logging in Java.

Almost all of what is mentioned would apply to any other SLF4J implementation. The differences that may exist would most likely be in the area of configuration...

...with that said, let’s start.