
The Big Data Model and my Geeky Ex-Boyfriend

So the AMAZING Big Data class is just starting on the first part of a multipart model. So… what could I possibly drive my teacher crazy with, when we are just starting? I’ll tell ya. I wasn’t sold on where it starts. The model starts at the Storage phase. And the Compose phase comes way later. “Compose” meaning figuring out what to do with all the information. To me it sounded like, if storage space is a limited resource, we should start by composing or designing a storage strategy. No? Nope. Prof. Funcke patiently explained that Big Data starts from data in storage.

Still, it didn’t make sense to me. You see, I’m a “workflow person” and I need to know what to do first, second and third. So I decided to prove the model’s order is wrong by contacting my personal guru in all things nerd: my computer engineer ex-boyfriend. As for his qualifications to be my go-to geek, I can mention that he is the Chief of Information Security at a big insurance company in Venezuela. Let’s call him Geeky (but adorable) Ex-boyfriend, or GEB for short.

So this is how my IM conversation went with him.

Me

Hi! First of all let me tell you I’m taking a class on big data.

GEB

That’s great!

Me 

And I’m trying to prove that the model we are using is wrong. My problem is with where it starts: from storage. And I ask: can’t I decide what goes to storage and what doesn’t?

GEB

But Big Data approaches it like this: keep everything and then organize. It’s like Mr. X (a person we both know) but with infinite time. X keeps every single newspaper and then looks for something interesting.

Me

But isn’t that a wasteful way to store data?

GEB 

But Big Data believes that space is cheap and you never know what you’ll use, so you keep everything and then run queries. And then you think of another query and so on… And then you never have the issue of not having a datapoint that you now need.

Me

But isn’t it true that most companies cannot afford that luxury?

GEB

Small and mid-sized companies surely can’t.

Me

See?

GEB

But governments and Fortune 500 can.

Me

Boo…

GEB

On the other hand, space IS cheap.

Me

But isn’t the idea of a model that it applies even to a mid-sized company that has something crazy like 10k datapoints/hour?

AND THIS IS WHEN GEB WENT FULL ON NERD ON ME. It took him a while to respond. I imagine he was making calculations by writing on his window glass, Good Will Hunting-style. 

GEB 

I’m 26 years old. I’ve lived 9490 days. If I kept a 5 GB movie for every day of my life I would have 47450 GB of video. That is about 50 Teras.

Me

And the cost of that?

GEB

1 Tera on an external disk is 100 dollars, so 5000 dollars.
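For anyone who wants to check GEB's window-glass math, here is a minimal sketch of the same back-of-the-envelope calculation. The function name and parameters are my own, made up for illustration. It assumes 365 days per year, 1000 GB per Tera, and GEB's quoted price of 100 dollars per Tera, none of which is an official figure.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored, as in GEB's 9490-day figure


def storage_estimate(age_years, gb_per_day, usd_per_tb, gb_per_tb=1000):
    """Rough estimate of lifetime storage and its cost (hypothetical helper).

    Returns (days, total_gb, total_tb, cost_usd).
    """
    days = age_years * DAYS_PER_YEAR
    total_gb = days * gb_per_day
    total_tb = total_gb / gb_per_tb
    # Multiply before dividing so the intermediate values stay whole numbers.
    cost_usd = total_gb * usd_per_tb / gb_per_tb
    return days, total_gb, total_tb, cost_usd


# GEB's numbers: 26 years old, one 5 GB movie per day, 100 dollars per Tera.
days, total_gb, total_tb, cost_usd = storage_estimate(26, 5, 100)
print(f"{days} days, {total_gb} GB, {total_tb} TB, {cost_usd} dollars")
```

With GEB's inputs this comes out to 47.45 Teras and 4745 dollars, which he rounded up to 50 Teras and 5000 dollars.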

So there you have it. An individual can store a huge amount of data for 5000 dollars. Of course I did not prove anything, but my exchange with GEB made me question how long it will take until small and mid-sized companies, and even individuals, can store every datapoint and then do something with it. And if they can’t afford to store everything now, should they start by creating a storage strategy?