Tuesday, May 26, 2015

Sometimes even Nightstalkers drink decaf

I don't make a habit of drinking anything decaf... I'm a Nightstalker, for Pete's sake! My normal routine is a cuppa tea before going to work, and then a second once I arrive at the office so that I can be calm whilst reading my email. Tonight, though, is the tail end of a three-day weekend, so I'm not a Nightstalker. I'm more closely related to Mr. Mom than anything.

Before I get to the Mr. Mom thing, though, I'd like to talk a bit about tea. Quite literally, before Jennifer and I got married, I was the quintessential comic book/cartoon knuckle-dragging Neanderthal tea brewer (those readers who are from areas where there is a strong tea heritage might want to skip this part, or have one of those inflight distress bags handy. In fact, you might need a trash can). Back in my bachelor days, my morning tea ritual went like this: get a shallow saucepan and fill it with cold water. Bring the water to a rolling boil. Without reducing the heat, carefully drop in one Lipton tea bag. Continue boiling until you can see a brown ring which marks the original "full" level of the pan. Turn off, discard teabag, and pour into a cup. Add a teaspoon of sugar. What were the attributes of this tea? Well, most lava flows in Hawai'i were less viscous than this stuff, and NASA has black holes on record that emitted more light than this tea reflected. The "flavour", if one could describe this brew, was somewhere between "turmoil" and "despair".

I'm not certain of what exactly it was that caused my tea preparation habits to change, but when I got married I went from Neanderthal to tea snob. My tea tastes have broadened quite a bit, although I generally still drink mostly black teas from the likes of Lyod, Tetley and Thompson's- as well as one or two Indian brands. For the most part, though, all of them are prepared in a similar fashion. Prepare the cup by adding a teaspoon of sugar, then the tea. In the case of bagged tea, it goes in the cup before the water. For loose tea, I have a red tea filter that is very close to the red that the Swiss company Bodum uses in their tea and coffee products. A rounded teaspoon of loose tea is placed on the filter, and when the water reaches a boil, it is poured into the cup- on top of the bag or through the filter. A timer, which has been set to one minute, is then turned on. When time has expired, the teabag is retrieved from the cup and given a gentle squeeze to coax the last of the amber liquor from the teabag; for the filtered tea, the filter handle is given a slight tap, and then it is allowed to drip a bit into the cup before being cleaned out. I have a slight variation to these procedures at the office. Although I have a filter and loose tea, I generally drink bagged tea- primarily because the coffee machine (which has the hot water dispenser) is on the opposite end of the office. So, in lieu of a timer, after the cup has been filled with water, I put a lid on it and walk back to my desk. With the pouring of the water, affixing the silicone lid, walking back to my desk and seating myself, approximately a minute has passed, so I remove the lid, give the bag a loving squeeze, and I have my beverage of choice.

So, decaf? Yessir, yessir, two cups full. Jennifer is still out of town, and while Mr. T has been quite good at emptying the dirty clothes hamper, he hasn't really bothered notifying me that the laundry is full. So, tonight I had three loads to wash and dry. In my defense, we've had a fair amount of rain the past few days, so I've had to wait for that to abate.

But I digress.

Some of you may be familiar with the saying, often attributed to Mark Twain or Benjamin Disraeli, but actually coming from an article by Leonard H. Courtney: "There are three kinds of lies: lies, damned lies and statistics." Truer words have probably been spoken, but as frequent readers may be aware, the unifying thread of this blog is data. And with that, ...

I received a letter from Commonwealth Edison, our electrical utility. Now, ComEd, like most if not all utilities, is trying to be "green". It's a sensible position to take, quite practical, and makes them look like good corporate citizens. As someone who is keenly interested in data and applications thereof, I read the letter with great interest. Now, I've received letters from them like this before, and they're quite interesting. A couple of graphs, some numeric comparisons, and some helpful suggestions on doing your part to conserve energy.

Judging by the graphs and numbers, we're energy pigs. According to their statistics, we used 84% more energy than our neighbors last summer.

We're just plain bad, right?

Well, maybe. Maybe not. You see, our family is different from the other families in our neighborhood in a couple of significant ways which the lies- er, stats, cannot reflect. With few exceptions, there are two types of families in our neighborhood (and in our neighborhood, all of the homes are free standing, single family houses). One type is a young family with school-aged children (grades K-12), and the other type is retirees, including singles, couples, widows and widowers. Our family has four adults; the two adults who work outside the home have very nonstandard hours. Our older son works in retail, and his schedule can have him working any day of the week, sometimes getting up as early as 0530. I work nights, and usually get home around 0500, but often later. Because of the strange hours, my wife usually does not get to bed until 0100 at the earliest. So, during the week, our house might see four to five hours of "normal" nighttime. During the day, two or three individuals will be awake and active.

How about the other families? Mom/Dad get up at 0500-0600 to get the kids off to school and get themselves off to work. Dinner ~1800, bed for kids 2000-2300, bed for parents 2200-0000. These homes will have a "normal" nighttime of closer to seven hours.

Singles and retirees? Similar to the family hours, with retirees probably closer to eight or more hours of normal nighttime. Also, much less cooking, laundry and climate control.

All things considered, I don't think we're doing badly at all. In fact, given the additional information I've considered, there aren't really that many "efficient" neighbors that are actually efficient. Just one example for your consideration: in the past week, I think I've done five large loads of laundry; I'd bet that the widow down the street may have done one small load in the same period. Who's more efficient?
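For the curious, here is a rough sketch of the kind of adjustment I have in mind. The 84% figure comes from the ComEd letter, and the "normal nighttime" hours are my own estimates from above. The neighbor's household size is a made-up illustration (I don't have their actual numbers), and the function name is mine, so treat the result as a thought experiment rather than a measurement.

```python
def usage_per_active_person_hour(relative_usage, people, quiet_hours):
    """Spread a household's relative energy use over its waking person-hours.

    relative_usage: energy use relative to the neighborhood average (1.0 = average)
    people: number of people living in the home
    quiet_hours: hours of "normal" nighttime per day, when the house is asleep
    """
    active_hours = 24 - quiet_hours
    return relative_usage / (people * active_hours)

# Our house: 84% above average (from the letter), four adults,
# four to five quiet hours a night (4.5 used as the midpoint).
ours = usage_per_active_person_hour(1.84, people=4, quiet_hours=4.5)

# Hypothetical retired couple using exactly the average amount,
# with eight quiet hours a night. The "2" is an assumption for illustration.
retired_couple = usage_per_active_person_hour(1.0, people=2, quiet_hours=8)

print(ours < retired_couple)
```

Under those assumptions the "energy pigs" come out ahead of the "efficient" couple, but change the household size or the hours and the comparison can flip. That is really the point: the raw percentage says little without the context behind it.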

For truth in data, I've got a few speeds and feeds from my growing Lego database to share. From a development standpoint, it currently consists of four worksheets- I've done nothing so far with the fourth, as it is going to be the summary page. I'm fairly certain that I'll be adding a few more worksheets- what I currently have are basic bricks, plates and Technic. The current grand total of all elements (parts) is 6,475. In the For What It's Worth Department, I think the highest count I've ever gotten is 24,000.

Monday, May 18, 2015

Data, defined (part 2)

Right after I hit the "PUBLISH" button on my last blog, I realized that I wasn't done. I know I had the option at that point to pull the piece back and add the other thoughts, but I don't like to throw out a wall of words just because I'm not done... I'd much rather give the reader a break and come back another day, and so here we are today with a continuation of sorts, taking a closer look at microdata.

But first, an update from the home front.

Sunday the 17th was the third Sunday that Jennifer had spent in the Dallas area. Our older son was off at a convention, leaving Mr. T and me a very quiet weekend. That's a good thing, too, as I still managed to rack up a sleep deficit. I've mentioned a few times that I'm a programmer who works nontraditional hours. I refer to my band of coworkers and myself as Nightstalkers. The big plus and big drawback of being a Nightstalker is that one often gets to stay at work until the job is done, which can sometimes mean a fairly long day, but the plus is that we are compensated for that time. Saturday ended up being a late day for me- nearly eleven hours, and then a technician was coming over to the house for the Spring air conditioning checkup at noon. At some point before noon I decided that I could not stay awake, so I asked Mr. T to wake me up when the tech arrived. The tech arrived and did his thing. I wrote a check for his service, and then went back to bed, getting up some time around 2030. Looking back, I really don't remember too much of what I did except for a bit of work on the Lego database. I was back in bed ~0430, and up Sunday a little after 1230.

Sunday was warm and the humidity was palpable. I opted for some breathable training attire to cut the grass. I have to say that I am perfectly capable of wearing some pretty nice-looking clothing combos, but fashion has little place in my workout or working outdoors clothing choices. As it was both sunny and windy, I had an Aussie-inspired wide-brimmed hat with a chinstrap. The short-sleeved shirt and shorts were both black sweat-wicking workout attire, and the footwear: orange sneakers. Blood orange red, actually. New Balance all terrain running shoes. Peer reviewed, double blind studies utilizing FLOOS and LRBL have verified that these shoes allow me to cut the grass 19.3% faster than the average suburbanite. You read it on the Internet- it's got to be true!

After cutting the grass, I figured I'd take a walk. One would think I'd have learned my lesson from the last time I did this (two weeks ago, actually). No. No I didn't. I grabbed a fanny pack (these workout shorts don't have pockets) and headed out. Approximately an hour later I walked back into the house, drenched in sweat, carrying an empty half liter water bottle.

All of that is a great segue to microdata. Why? Well, for starters, I have an Omron pedometer. I have the option of publishing my workout data to their website- in which case, my data would be a part of Omron's small data, and quite possibly, fitness big data. My choice, though, is to upload the data to the Omron tracking program on my computer, making it MY microdata. In the FWIW category, I logged 6.2 miles (about 9.98 km) today- my best day in nearly two months of tracking.

The Lego database is growing slowly. I'm using Excel 2007, and having to relearn some things. I'm sometimes asked what someone should learn in Excel to be useful on the job. Well, it depends on the job. Every place where I've used Excel I've needed at least a few things that no one else asked for- and none of these were financial or statistical environments (which tend to be a lot more predictable in terms of desired skills). The Lego counts stand as follows: Basic bricks- 1 part number, 12 colors, 1666 elements. Plates- no counts as yet. Technic- 3 part numbers, 3 colors, 1049 elements. Total elements (pieces)- 2715.
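For anyone who prefers code to spreadsheets, the tallies above can be sketched as a tiny flat table. This is only an illustration: the row layout and the names `lego_rows` and `total_elements` are made up for the example and are not how the actual Excel workbook is organized.

```python
# One row per category, using the counts reported in this post.
# The layout is hypothetical; the real data lives in Excel worksheets.
lego_rows = [
    {"category": "Basic bricks", "part_numbers": 1, "colors": 12, "elements": 1666},
    {"category": "Plates", "part_numbers": 0, "colors": 0, "elements": 0},  # no counts as yet
    {"category": "Technic", "part_numbers": 3, "colors": 3, "elements": 1049},
]

def total_elements(rows):
    """Sum the element counts across all category rows."""
    return sum(row["elements"] for row in rows)

print(total_elements(lego_rows))  # 2715
```

Note that only the element counts are summed. Colors are left alone, since the same color can turn up in more than one category and adding them would double count.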

As always, I am hochspeyer, blogging data analysis and management so you don't have to.

Wednesday, May 13, 2015

Data, defined

Although it is not my intent, I am certain that this post has the potential to step on a few toes, possibly bruise an ego or two, or ruffle some feathers. I may even get someone mad. Really e-mad.

For starters, I do not have any letters, diplomas or certifications, and I am not currently professionally employed in whatever one might consider the "data community". Whatever that might be. I do not claim to be an expert or have any special expertise or training in the areas of Big Data, the Internet of Things/Everything, Statistics, Analytics or The Cloud. I was once employed as a data analyst working with Small Data for a short time.

Whew!

So, who and what exactly am I?

I'm a guy who tweets (and retweets) primarily on the subjects of Big Data, IoT, programming and related topics. As far back as high school- maybe even earlier- I've been interested in data. It was either my music collection or Fletcher Pratt's Naval Wargame that gave me my start in classifying and quantifying. I remember even attempting to do a few music surveys way back when, and some of the respondents were unhappy because the polls were not simple popularity contests, but the answers were weighted based upon their position on the poll. Fast forward to today. I'm currently building a flat database of my Lego collection in Excel 2007 (why 2007? Because that's what I have on the computer nearest to the Legos!). This, in turn, will be added to my master database Forty-Two- so named because it answers the question of Life, the Universe and Everything.
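The weighted music surveys mentioned above might have worked something like the sketch below. The exact weights used back then are not recorded here, so the scheme shown (first place earns the most points, last place earns one) is an assumption, and the ballots and band names are invented for the example.

```python
from collections import Counter

def weighted_tally(ballots, slots=3):
    """Score ranked ballots: first place earns `slots` points, the last slot earns 1."""
    scores = Counter()
    for ballot in ballots:
        for position, choice in enumerate(ballot[:slots]):
            scores[choice] += slots - position
    return scores

# Hypothetical ballots, each ranked favorite first.
ballots = [
    ["Band A", "Band B", "Band C"],
    ["Band B", "Band A", "Band C"],
    ["Band B", "Band C", "Band A"],
]

scores = weighted_tally(ballots)
print(scores.most_common())  # [('Band B', 8), ('Band A', 6), ('Band C', 4)]
```

A simple popularity contest would count only the first-place votes. Weighting by position also rewards the acts that show up consistently in second or third place, which is probably what annoyed some of the respondents.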

Having said ALL of that, I'd like to start off by saying that the term "data" may not be as concrete as we are led to believe. In my world, data comes in the following flavors: Big Data, Not-So-Big Data, Small Data, Micro Data, and Statistics. Depending upon the size of the dataset(s) and one's perspective, most- if not all- data can fit into more than one classification. Really? Sure. Case: say there's a hypothetical high school senior who is one of the stars of his basketball team. He's a good defender, doesn't get a great deal of fouls (below the league average), and is about average in scoring- except he leads the league in free throw percentage. Several colleges and universities are interested in him- they've got data on this fellow going back to 6th grade. That's data- to them. To me, a person who couldn't care less about basketball- it's nothing more than a bunch of irrelevant stats. On the other hand, these same scouts would not be impressed by the number of PhD's that follow me on Twitter.

So, how big is a Big Data dataset? I asked a coworker. He wasn't sure, but thought a mail list might qualify. Don't laugh too soon- some of the mail lists I've seen have more than 10 million names. To me, though, I'd put that in the Not-So-Big Data or Small Data categories. The IoT, Amazon, Google, Youtube and Wikipedia definitely fit into the Big Data category, but to the average person, these can be tough to visualize. So, for what I think might be a decent, understandable Big Data dataset, I propose the 2010 U.S. Census. It was a 10 item questionnaire (with a few extra answers possible) that was mailed to 135,000,000 addresses representing approximately 309,000,000 persons.

Small Data could be a database, a website or the phone directory of a small to medium sized city- the lines are pretty fuzzy here.

Lastly, there's microdata. I'm not sure if this term is used anywhere else, but I find it to be a convenient term for personal data- data generated and maintained by one person or one family for their own use and not often formally shared. A cataloged collection of coins, stamps, recipes, exercise/workout logs or Legos- all of these are Microdata in my worldview.

Thanks for your patience- I hope you enjoyed this. I generally write a lot less... I'm not a fan of writing or reading walls of words!

As always, I am hochspeyer, blogging data analysis and management so you don't have to.

Tuesday, May 12, 2015

Batch v2.0

I figured I'd go with a very "I.T.-sounding" title for a change, and quite honestly, anytime you throw out verbiage resembling "v2.0", it implies that this is the new and improved version of something, and it should be checked out. I'm glad you've stuck with me thus far, as I'm about to possibly let some of the wind out of your sails of enthusiasm: this is not about a new and improved app or some software product. It's not a Python library or a Hadoop project.

Actually, it's short for the decidedly unsexy sounding "Bachelorhood Revisited", which is more or less where I find myself this morning. The short version of the story is this: Jennifer's Mom had an upcoming medical procedure which required at least a one day hospital stay, and then some assistance once she got home. As Jennifer was pretty much the only family member available for these tasks, she flew over and is spending pretty much all of May with her folks.

So this leaves me in an odd state of quasi-bachelorhood: on the one hand, I can come and go whenever, turn the lights on in the bedroom at all hours, and generally don't have to be too quiet when I come in. Those are the "pluses".

In the debit column, I have to take care of the bills, meter reading, garbage- pretty much everything household related that Jennifer normally does (of course, she helps out tremendously in a remote capacity). I do some of the cooking and food prep... not much, really, but much more than when she's here. I do the vast majority of the dishes, and all of the laundry. And I have to warm the bed up all on my own. This all happens before or after work and on what passes for a weekend.

In summary: this bachelor thing is for the birds. I can't wait for her to get back!

Data news: I've been watching the (free) Google analytics for my blog, and I'm at a loss for this month's performance, which is a personal best for me in terms of viewership: with a little over a third of the month elapsed, I've surpassed my previous record month by 9.2%. And there are over two weeks to go! Woo hoo!

Lego data news: The Lego spreadsheet continues to grow and evolve. I finished the count on the white 3001 elements (the basic, ubiquitous 2x4 Lego brick); next on the list are the red 3001 elements. The current grand total stands at 1792 pieces. I should note here that when you're holding some of those nifty molded ABS plastic toys in your hand, they are not parts or pieces or bricks... they are elements.

SUL news: The Secret Underground Lair officially has all of its furniture repositioned in the new and improved floorplan. There have been some unofficial murmurings, however, that one more move may be in the offing, although this would require disassembling my half of the office. The jury is still out on this... more news as it becomes available.

As always, I am hochspeyer, blogging data analysis and management so you don't have to.

Saturday, May 9, 2015

Life in the Twittersphere

I'm not going to do the parody lyrics here, but bring up the Eagles' "Life In The Fast Lane" from their "Hotel California" album and you can sort of hum along- it pretty much works. I used to do a lot of parody lyrics, as a matter of fact, but as far as this blog goes, I really didn't make the connection with the Eagles until I actually saw the title of the blog.

For what it's worth, this was going to be another "Hoodie Migration" blog, but that was before I read a few posts from the Outmannedmommy blog. To be fair, the Outmannedmommy blog is quite funny, but it may be NSFW (not safe for work) as F-bombs occasionally detonate- especially from guest writers! Still, I enjoy Mary Widdicks' writing- because her style is great, and I can relate to her writing.

Also, strangely (or not?), I don't read a lot of other blogs. Mary's is the only funny one I normally read; most of the rest are retweets on Twitter (nothing wrong with that) about the IoT and big data.

And here's the crux, the nexus of this post: Me and Twitter (if you want to try your hand at parodying this, try Tom T. Hall's most excellent "Me and Jesus" or "Faster Horses, Younger Women, Older Whiskey and More Money").

Twitter is by far the number one reason for my dearth of blogs as of late. I've been contributing a fair amount to the Twittersphere, focusing primarily on Big Data and the Internet of Things. One of my coworkers constantly tells me that I'm a fraud (he's kidding... mostly) because I'm not a PhD, a data scientist, a statistician, or any number of other things associated with Big Data and the IoT... and at that point I always remind him that I make no claims to be anything more than a catalyst or a conduit for ideas. And even though I'm no expert, I've become fairly adept at picking interesting and informative stories, factoids, infographs and blogs. This is evidenced by several folks that ARE PhD's, data scientists and statisticians who follow (and retweet) me.

That's the really short version of Life in the Twittersphere, but I need to get to bed, so I am going to close this and hopefully get another blog out very soon.

News from the Secret Underground Lair (SUL): I think I had mentioned that Mr. T and I were planning a bit of remodeling in the SUL. I'm happy to report that the majority of that work is now complete. Well, the furniture moving is done. Everything that got moved out now needs to be moved back in- I wish there was a defrag utility for small offices... press a button on the wall, and all of the books, boxes, doodads, computer components, computers, etc would find the optimal place to rest and automatically go there.

Finally, data news (this blog is about data, right?): I've got a spreadsheet roughed out, and have put a bit of test data in. So far, everything looks good: according to my flat database, I have 71 Lego elements.

As always, I am hochspeyer, blogging data analysis and management so you don't have to.