When I first saw a message from LinkedIn in my inbox titled “Add skills like ‘Ruby’ to make your profile easier to find”, I let out a little chuckle. Cute. LinkedIn crawled my GitHub or maybe the text content of my LinkedIn page and wants me to make sure they got it right by adding a skill into their formal system.
They got every single one spot on. I could come up with simple possibilities for any of those except for Python.
I started coding full time in Python about 3 months ago and haven’t had time to open source anything. I haven’t tweeted about it. I haven’t posted about it on Hacker News. Nothing. All I’ve done is Google for “Python Object Inheritance” or “Networkx MultiDiGraph Methods” (best library ever, btw).
So here are some guesses as to how I think they did it:
They got lucky. Though Ruby and Python have completely different core models* they both feel pretty similar to one another.
They relied on a (possibly supervised) LDA-like model that essentially said “look, this guy is in startups, he is a data guy, and he knows Ruby, he really should know Python at this point”
They watched who started following me recently on GitHub and noticed that there was a bump in people that were proficient in Python. Similarly for Twitter.
They bought a portion of my search data from one of those pixel tracker sites that power search re-marketing.
The thing that will be very interesting to see is how they sell the other aspects of what they know about me to their clients (recruiters, and possibly clandestine intelligence organizations). This would be a gold mine for them. They could, for example, say they know that I’m trained as a structural engineer, a data scientist, and a developer. An organization looking to develop software that simulates wind over 50 years on a free-standing structure to develop more detailed failure scenarios and risk profiles would be desperate for me. The time it would take them to find a guy like me would be immense, so the value of LinkedIn is in closing information gaps, but unlike Google’s search, they do it in an area that the market is willing to pay for up front.
If this post happens along the desks of anyone working at LinkedIn’s data dept. please feel free to email me if you want to have a chat about how you guys did it. I trust you know where to find me.
*(Ruby is an Object Oriented language with functional aspects for pleasure, while Python is a functional programming language that bolted on an Object Oriented paradigm)
One thing I have found is that economics has a heartbeat by the decade and that the number of factors is very high.
For example: I used to take Ireland as an example of a European country that deregulated and subsequently outperformed nearly everyone. I would show graphs, etc. to everyone I discussed politics with.
Then of course the crash happened. Things are not *simple*; they are very, very complex. Worse, growth happens over decades, and during that time population demographics change, technology changes, and relative resource pricing changes.
Heck, something that keeps my brain fairly active is the importance of intelligence distribution. What if countries with a lower average IQ but a higher average top 1% are better able to organize? Where does this work and where would this be a disadvantage?
What about economy diversification? The Canadian dollar has a roughly 0.1 correlation with the price of oil due to Alberta and NFLD, but we also have manufacturing, tech, media, and finance. Does this stabilize our dollar and allow further investment into any one of the hot sectors or does it needlessly put the out of favor group up against severe currency pressure?
What about subgroup ideology shifts? Does Obama being president enfranchise blacks, the highest crime/welfare subgroup. If he does and there is a 10% positive shift in “outlook” and a subsequent drop in crime and welfare burden 10 years from now how well can that be discovered?
Ultimately I’m a Libertarian because of ethics, not economic expediency. But people want answers and often times “it is really complicated” isn’t the appropriate answer. There are so many inter-dependencies and factors that simple statements like “Sweden does it and it is amazing there!” are worse than useless. They allow people to cheat intellectually, as I once did with Ireland.
I can think of nothing more complicated and intractable than economics. If we do live in a simulation, I suspect our simulators are split testing economic policy.
Presume that there are an infinite number of universes, an infinite number of dimensions, or an infinite number of non-identical universe “cycles”.
Presume that the human mind can be uploaded to a computer either through:
an atom-by-atom simulation,
or a mathematical deconstruction of the graph of nodes/neurons that make up the human mind.
Presume that there are sufficient bits and ops in the universe to simulate a human mind.*
From the point of view of a simulated consciousness, pausing the hosting computer for a time then resuming it will go completely unnoticed besides the change of state of what is external to the consciousness. For example, if a clock was placed in front of a webcam of a computer that held a simulated human consciousness and the computer was paused for one hour and then resumed the human consciousness would report only that the hour had incremented on the clock nearly instantly, not that he saw black for an hour. This concept will be called non-conscious-time-irrelevance.
From the point of view of a simulated consciousness, pausing the hosting computer, then perfectly copying its state to a new hosting computer, destroying the original hosting computer, and resuming the process on the new hosting computer will go completely unnoticed by the simulated entity. This concept will be called simulator-irrelevance.
A Turing Machine in a finite universe can only contain a finite set of logical expressions. Since a human mind can be simulated by a computer it follows that there is a finite set of logical expressions that comprise a human mind-state.
Since there are an infinite number of universes, an infinite number of dimensions, or an infinite number of non-identical universe “cycles”, there are an infinite number of particles and particle arrangements.
Since there is no God, an infinite number of particle arrangements will not be artificially hindered from creating every possible organization and combination of these particles.
Since the simulated human mind-state is finite and every necessary particle combination that can happen will happen, the mind will continue to exist at some point in the past or future. Given non-conscious-time-irrelevance and simulator-irrelevance, it follows that from the point of view of the consciousness death is unattainable, even for a non-computer-simulated human consciousness.
* The argument can be made that since human consciousness exists in this universe, there are enough bits for an appropriately constructed Turing Machine.
I think I see what you’re saying, but I think for me there’s still an unresolved conflict between what your proof would imply and what we experience. Basically, if what you say is true, switching Turing Machines wouldn’t happen just at death, but at all moments along your consciousness. Also, I don’t see a particular reason why you’d switch into a reality that’s exactly the same as the one you switched from, just from probability you should switch into a reality that’s different. Shouldn’t you notice a lot more external changes during switches?
Which is an excellent point, but in no way refutes the proof, which does not require that current observers need to have the expected path. I responded weakly with the following:
I guess it would come down to a very hard statistical problem. What is more likely: a reality-based body, or random bits in some extra-planar computer happening to come into alignment that would form “you”? On the one hand, when you die, even if you have to wait 10^51 years, eventually you will come into existence again, even if only for a second.
But explaining our experience is not necessary because our experience does not refute the proof.
I personally reject the proof because I don’t believe all the premises, but if the premises are revealed to be true, then I would accept that death is unattainable. Which is interesting because the lack of an omniscient, omnipresent God is a fundamental requirement of the proof, unless that God truthfully promised mental immortality or continually random universes.
If the premises are true I would suspect the reason one doesn’t seem to pop in and out of multiple realities would probably have to do with the computer simulation argument. Which would drastically increase the ratio of bits and ops that are organized towards intelligence in the universe as well as providing a higher ratio of predictable continued realities.
Update2: based on feedback I realized I needed to clarify what is meant by “infinite universes”. What I mean is universes or dimensions that are much like ours (same physical constants, particle sizes, etc.), but where the initial conditions were slightly different, resulting in a different distribution of stars, planets, etc.
Update3: It turns out I’m not the first to have this sort of theory. People have brought up lazy immortality as something that makes the same argument from a different angle. Also, Permutation City was brought up as a book that encapsulated some of the ideas I presented.
Lots of great feedback from readers on this one. Feel free to reach out to me on twitter or gmail if you have anything else that you think would be interesting to add.
Yes. Given my limited view, Color is probably overvalued. Yes, Groupon better roll out something new, and fast, or it will never give any decent ROI for its investors.
But enough with the anecdotal evidence. For each Color or Groupon there is a Google (also had high valuations) or a Dropbox (quite possibly the most awesome company on the planet).
The real question is whether a bubble is happening.
“An economic bubble is trade in high volumes at prices that are considerably at variance with intrinsic values”
Right now interest rates available in Canada are south of 1% and from what I can glean online current “risk-free” interest rates in the United States are even less.
At less than 1% it would take over 70 years to double an investment. (I put aside inflation because A. it will only further my argument and B. I have a controversial view on inflation that I don’t want to get into.)
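The doubling claim is easy to sanity check. Here is a minimal sketch, assuming a fixed rate compounded once a year (my assumption, the figure above doesn't say how it compounds). At exactly 1% it comes out just under 70 years, and anything lower pushes it past 70.

```python
import math


def years_to_double(annual_rate: float) -> float:
    """Years for money to double at a fixed rate, compounded annually.

    Solves (1 + r) ** n == 2 for n.
    """
    return math.log(2) / math.log(1 + annual_rate)


# Rates at or below the "south of 1%" figure; the specific values are illustrative.
for rate in (0.01, 0.0075, 0.005):
    print(f"{rate:.2%} per year -> {years_to_double(rate):.1f} years to double")
```

The function name and the sample rates are mine, not anything from a bank's rate sheet.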
Let’s look at the expected returns from startups. I was going to write a CrunchBase API client, but luckily a site already does this for me.
So first thing to notice: Total amount invested is barely changing, even despite plummeting interest rates. Second, and more importantly, total acquisition amount: $450 Billion. This isn’t including Facebook, or any other cool startups that haven’t been acquired, it doesn’t include companies that went IPO, it doesn’t include companies that have always stayed private, unfunded and quietly paying out dividends to their founders.
But let’s work with $450 Billion just because we’re trying to really make the case.
60,000 companies on CrunchBase with a total of $450 Billion in acquisition spend.
Imagine it was like buying a lottery ticket. If you invested at a valuation of $7.5mm per company right from the beginning (seed stage or Series A) you would expect to break even (well, actually you would probably have a 1X liquidation preference, so in reality it would be higher than that, but let’s ignore that).
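The $7.5mm figure is just the total acquisition spend divided by the number of companies. A rough sketch of that arithmetic, using the two numbers quoted above and ignoring liquidation preferences, dilution, and the time value of money (the variable and function names are mine):

```python
TOTAL_ACQUISITION_USD = 450e9  # total acquisition spend quoted above
COMPANY_COUNT = 60_000         # companies listed on CrunchBase, as quoted above


def break_even_valuation(total_exit_value: float, company_count: int) -> float:
    """Average exit value per company.

    If you bought the same fraction of every company at this valuation,
    the money paid in would equal the money paid out in acquisitions.
    """
    return total_exit_value / company_count


valuation = break_even_valuation(TOTAL_ACQUISITION_USD, COMPANY_COUNT)
print(f"Break-even valuation: ${valuation / 1e6:.1f}mm per company")
```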
Complaining about “sky high valuations” is crazy talk. Most of the companies coming out of the best incubator in the world are getting about $5mm, well below the $7.5mm figure, and that doesn’t even account for the fact that they are objectively better performers than a random sampling of companies.
The truth is this: Bubbles don’t exist without my aunt’s mutual fund getting involved or my next door neighbor getting told to mortgage his house to invest by his financial advisor. Web 1.0 was all about IPOing on the NASDAQ and fleecing the public with business models that disregarded profit. This time around things are different. At every stage of the process you see startups with business models. Guestlist, Github, FreshBooks, heck even Groupon, are making nontrivial money relative to the valuations and expected future growth.
People are getting online, and living online. Just as it wasn’t a bubble when we entered the automotive era and huge car companies were popping up everywhere, it isn’t a bubble when whole groups of people are spending more time online than watching TV or book reading or newspaper reading.
Except pure* social.
Stay the %&#* away from social. Go find an online circuit board diagram company to invest in or something.
* to differentiate from something like Github which calls itself “social coding”.
But I wanted more. I wanted to write my name in Weave Silk for a personal website I’m putting together. After trying to write it out manually and not being happy with the results (colors were never what I wanted them to be, my hand couldn’t stay steady for long enough, I would mistime the wind) I decided to look into the source code, even though I barely know how to write a “hello world” program in JavaScript. Obviously that failed nearly instantly. I’m sure I could go through it and do what I want to do, but that might take days or longer.
I use these as a last resort (usually in Excel) when deadlines are fast approaching. Compared with Ruby, or even VBA, AHK scripts are a source of constant surprise.
I’ve always needed to riddle my code with “sleep 10” just to get the most basic key presses to work reliably. The script tries to execute so fast that either my OS or the application I’m using doesn’t register the scripted key strokes.
Also, in certain cases, there is just no way of using a variable where you want to. Rather than setting variables as being accessible by prefixing them with an “@” or “@@” as in Ruby, you need to invoke the keyword “global” within the function you want to call the otherwise inaccessible variable. But in predefined functions it raises an error when you call “global” so you are out of luck. I’m sure that somebody somewhere knows how to do it, but I’ve googled around enough to give up trying to solve that problem.
As my startup has been getting nearer to launch, I’ve made an effort to reach out to people that I’ve helped or connected with over the past year. I wrote individual emails to 95 people, recalling when we spoke last, what they are working on, how my startup is going, etc. These were heartfelt, non-spammy reach-outs.
My first 30 were discouraging. Under half of the people I emailed got back to me. So I tried something new: a hard cap of 500 words, but under 100 was what I aimed for.
Since the switch every person has replied.
(This blog post is 97 words, not including the title or this sentence.)
I just did it. I wrote my first “Hello, World!” program in my own little programming language. A tiny crest of a hill on my way up the mountain.
The first thing I wanted to do after I came back from a smoke was to throw on a movie. It’s 10 pm on boxing day, and it isn’t like I’m on the clock at a day job. But I think I’ve finally learned that stopping after a minor success is something to fight against. So I’m moving forward, for two reasons:
Stopping means less stuff gets done. (duh)
Because I’m in the zone, stubbing out what to do next will be much easier now than later, which means picking up where I left off later will be easier which means I’ll get more stuff done.
I’ve found that pushing past the initial highs into the next phase, whether it is coding or otherwise, means you get more done and life is better.
I haven’t fully decided on the name of my programming language, but in case you are wondering, it’s going to be a highly concurrent, data-analysis-geared language with influences from Erlang, Ruby, and Anic. Not only will every line of code try to execute at once, each one will be fun to write, like Ruby!
[P]ractically all the returns are concentrated in a few big successes. The expected value of a startup is the percentage chance it’s Google.
He then goes on to say
Some super-angels seem to care about valuations. Several turned down YC-funded startups after Demo Day because their valuations were too high. This was not a problem for the startups; by definition a high valuation means enough investors were willing to accept it. But it was mysterious to me that the super-angels would quibble about valuations. Did they not understand that the big returns come from a few big successes, and that it therefore mattered far more which startups you picked than how much you paid for them?
This has got to piss off some people that invest in startups for a living. Especially coming from a guy that typically gets 6-7% of a company for at or under $30k. I’ll be analysing Fred Wilson’s response below, but first you should read it in its entirety here. (protip: pressing the middle mouse button opens a new tab, I’m still shocked by how few people know this)
On the surface it looks pretty reasonable. He took a 2004 fund, so there should be enough time that has gone by. The first thing that struck me was that he only had two companies go bankrupt during this time. That is outstanding. Fred is clearly an expert investor, his insights are amazing, and along with Gabriel Weinberg (who also had a quibble with the Paul Graham post) Fred’s blog is one of the very few I follow outside of what is submitted to Hacker News.
But here it looks like he is wrong. Not only that, he proved Paul Graham’s point with his own data.
The first thing to remember about investing is that you don’t care about 10x returns or 2x returns. I would take doubling my money in a day over 10 folding it over a lifetime, as would any other sensible investor. It is the compounding returns that matter.
So the first thing we need to do with Fred’s graph is convert it into a spreadsheet.
Woah, lots going on there, so let me break down what we have done. I manually counted up the number of companies from the original chart on Fred’s blog post. I’ve assigned a value of 0.5 to the bankrupt companies because I know you at least get tax breaks when you lose money and that in certain circumstances companies can get sold for their on book losses. It might be too high, could only be worth 0.1, but it just helps my argument for it to be 0.5, so I’m going to stick with that. (At least I’m honest.)
Next what we do is convert the 25x and similar returns into their yearly compounded return rates and bucket them into the nearest decile of percentage. Which leads us to…
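For anyone who wants to redo the conversion without a spreadsheet, here is a sketch of the two steps. It assumes a six-year holding period (the 2004 fund looked at six years later) and annual compounding. Only the 0.5, 0.1, 1x and 25x multiples come from the discussion here; the others are made up for illustration, and none of this is Fred's actual data.

```python
def annualized_return(multiple: float, years: float) -> float:
    """Yearly compounded rate that turns 1x into `multiple` over `years`."""
    return multiple ** (1 / years) - 1


def nearest_decile_pct(rate: float) -> int:
    """Bucket a rate to the nearest 10 percentage points (0.71 -> 70)."""
    return int(round(rate * 10)) * 10


YEARS_HELD = 6  # assumption: 2004 fund, evaluated six years later

# 0.5 and 0.1 are the two candidate values for a bankrupt company.
for multiple in (0.1, 0.5, 1, 2, 10, 25):
    rate = annualized_return(multiple, YEARS_HELD)
    print(f"{multiple:>5}x -> {rate:7.1%} per year, bucket {nearest_decile_pct(rate)}%")
```

Bucketing by rounding to the nearest 10 points is my reading of "nearest decile of percentage"; a different bucketing rule would shift companies near the edges.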
Maybe not truly bimodal, but not bad for the sample size we are working with. Sure, if I put the value of the bankrupt ones down at 0.1 value they go to -30% per year, but really, Fred isn’t making money on the people that are only worth 1x six years later anyways. So really, his returns are bimodal(ish).
Really it makes sense that returns are bimodal, especially in software. The cost of the next incremental sale is nearly zero once your product becomes commonplace, so it is natural for a whole host of startups to fail early on (high upfront costs, like developer salaries) and for a few to get into growth stage (highly optimized sales cycles, enough volume for split tests, a recognizable brand, marketplace trust, cheaper capital, CPAs below NPVs, leveraged coder hours, etc) and beyond.
With a few exceptions, in software you either make it, or you don’t.
I will make one point though: based on some back of the napkin calculations of Google’s Series A investment size, market cap in the year Google went public, and what is openly available of what the founders of Google continued to own (20% each), I’ve estimated Google’s annualized returns from the time they took Series A funding to the time they went IPO to be somewhere between 125% and 200% (that’s annualized(!)). Which is clearly not what Fred is making on his stars.
This discrepancy is just fine. Obviously Paul Graham doesn’t think that to be a successful Angel you need to get a company that pulls in triple digit yearly gains. His point is that you don’t let the really good ones get away because they are asking twice as much as you were expecting; the Series A venture fund that worked with Google didn’t. Returns are bimodal (or quasi-kinda bimodal). One interesting observation is that Google was notorious for the founders having a large equity stake so late in the funding game. Just one point of data, but maybe good founders know not to give investors an unreasonably large amount of equity.
People have trouble visualizing large numbers of things. After looking at a terrible infographic on CNBC, I decided to take the ugly-but-works approach. Most people have a feeling for how large the Twin Towers are, so I decided to map the amount of oil in numbers of World Trade Center towers. It came out less than I expected, just over 40% of a single World Trade Center’s volume.
This comes back to something I think about often: when it comes to infographics, there is often a fight between useful and pretty.
For extra understanding, click here to see a link showing the building footprint in red on a map that you can zoom in and out on. Really shows you the scale of the earth to how much oil was spilled.
True, False, & NULL/None/nil/Blank logic in MySQL, Python, Ruby, and Excel
(NOTE: This blog post is extremely old and may give the reader the false impression that I have no idea what I’m doing. I’m keeping it up because I’m not a coward.)
EDIT: see below for an explanation on the “nil and False => nil while nil or False => False”
Being a Data Guy (in the corporate world, a “Business/Market Intelligence Analyst/Manager”) has its advantages. There is always a new question to answer, a new system to optimize, and a new split test to run. Management makes a ton of money through discoveries found in the data, and that means that bonuses are not far behind. But one dark side of being a Data Guy is the sheer number of tools you need to use (quickly!) and the inconsistencies across them. Even if some of them are as cool as Ruby.
Allow me to introduce the True, False, and NULL problem. On the surface this doesn’t seem that hard. How many inconsistencies could there be across just 3 values (or lack thereof in the case of NULL) represented in 4 languages?
As it turns out: Many.
The old pros out there are already screaming, “hey! NULL in MySQL has nothing to do with nil in Ruby! This discussion is meaningless! Don’t you know anything? Go read some books by W. Richard Stev…”
Well, they have a point. NULL in MySQL is not the same as nil in Ruby, but since this message has been pounded into me the hard way I was hoping to save others the headaches that have befallen me. Grey beards, go back to kernel hacking for now, we’ll grab beer later.
The “Or” Logical Operator
Ok, so far so good, we can see that regardless of the tool used “NULL or TRUE” returns “TRUE.” And it should, given that no matter what is on the other side of an “or” operator we already have a “TRUE” value. The “NULL or False” column shows an inconsistency between MySQL and the rest of the bunch. Why does MySQL return “NULL” when the rest of the group give “FALSE”? Pretty easy answer there is that in MySQL “NULL” basically means “unknown”.
If that still doesn’t answer the question in your mind, imagine this: You are a home to home surveyor, asking people about their favorite politician. You get to a house to ask the owner whether they are going to vote for the Libertarian Candidate or the Constitution Party Candidate (there are many surveyors in dreamworld, you see) but you can’t tell if the person is male or female, say because they are behind a screen door. When you get to your trusty MySQL database you leave the value of NULL in the “is_a_man?” column because you cannot tell one way or another. That is what the MySQL guys and girls had in mind when they decided “NULL or FALSE” returns “NULL”. They basically said the meaning of “NULL” is “unknown”. So when we say “NULL or FALSE” we really can’t tell one way or the other. This is especially important in numeric type fields where we can’t just print the string “couldn’t tell the height in inches due to darkness.”
What about the rest of the bunch, why do they default to “None, nil, etc…”? Because to them the lack of a value indicates exclusion from the logical operator. In other words, imagine you were a computer program interpreter. Whenever you saw the word “nil” (or otherwise) after an “or” operator you basically said “screw it, forget about that guy, he hasn’t called me in months”. That would describe Ruby, Python, and Excel. They only look to the remaining side of the “or” operator.
What about the last column of the table? Well, MySQL stays consistent with the whole NULL = Unknown value situation, basically saying: “here I’m giving you a NULL, but that actually means ‘unknown’ because either of these two values could be a true”. Totally cool and consistent.
Here we also see Ruby and Python try to tell both sides of the “or” operator to go #*&^ itself, but are left with nothing so they say “got nothing” in their own way, nil and None, respectively.
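The Python side of the “or” behavior is easy to check for yourself. A tiny sketch (the dictionary is just a convenient way for me to label each case; paste the expressions into a Python prompt if you prefer):

```python
# Python's `or` hands back one of its operands, not a fresh boolean.
or_results = {
    "None or True": None or True,
    "None or False": None or False,
    "None or None": None or None,
}

for expression, result in or_results.items():
    print(f"{expression:>14} => {result!r}")
```

This prints True, False and None for the three cases, in that order.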
Excel is different. And as we will come to learn, when in doubt Excel will be different. It basically says “Hey! How’s it going? So I’m trying to hide this concept of ‘NULL’ and ‘nil’ from you because you guys are mostly MBAs that have a hard enough grasp dealing with a blank cell, but you are making it really hard for me. So I’m just going to throw up the error ‘#VALUE!’ and you can call over the programmer paid one fourth as much to figure it out for you.”
This is a recurring theme with Excel. Excel doesn’t really have a NULL or even nil field. Excel has a “Blank” field, which it will happily take as input, but will almost never serve as output, so it gets creative. (As a side note, entering a single “’” into an Excel cell gets you an “empty non-blank field”, useful for when the default behavior isn’t what you want. This comes in handy everywhere, from conditional formatting to logical comparisons to type safety.)
The “And” Logical Operator
Here is where things start to get batty. Let’s start with the “NULL and TRUE” column. MySQL handles this situation as we would expect, given that in MySQLese “NULL” just means “unknown”. Ruby and Python can’t tell the nil/None boys to stuff it this time, because there is a pesky “and” operator, which foiled some plans for world domination, but that’s a story for another day. They basically say “hey, you may actually want to take a look at this sarge”. Personally, if I were a programming language interpreter I would either say “nil or false => false” OR “nil and true => nil” not both. I’d make it easier on the girls dancing with me. Then again, 80% of my keyboard time is spent with MySQL, so maybe I’m just used to blonde haired girls.
Things were just starting to get boring until Microsoft stepped in. Ladies and gentlemen, Microsoft assumes that if you aren’t FALSE you may as well be TRUE in an “and” operator. Blank, the number 3, “luapnor” - it doesn’t matter. Like Republicans in negative land, everyone is their friend until you specifically tell them you hate their guts. Just for fun, try this in cell “B1” in a brand new Excel spreadsheet: =IF(A1,"lol_true","fffuuu"). It should come out to “fffuuu”. Great. Now replace it with this: =IF(AND(A1,TRUE),"lol_true","fuuu"). Now what does it come to? “lol_true”.
So why is this? It comes back to MS Excel trying to hide complexity from MBAs. Given an “and” operator, Excel will ignore any non-boolean values to make life easier for people just messing around with spreadsheets (these often can have huge holes in them that MBAs want to ignore). But when there is no “and”/“or” present, the IF statement NEEDS to try to use the blank cell, which it does in the =IF(A1,"lol_true","fffuuu") cell.
To the next column!
MySQL is still acting as it should. It treats “NULL” as “unknown” and predictably says “no matter what ‘NULL’ was supposed to be, I have one FALSE, so I can safely say ‘FALSE’”.
Excel will predictably say “F that NULL, I’m just going to ignore it. This whole thing is False.”
Which brings us to Python and Ruby. I don’t have words for why “nil or false” gives me false in Ruby (and Python), but “nil and false” gives me nil. Even if Ruby told nil to get out of town it would still be left with FALSE. I’m going to try to keep my faith in Ruby by saying this fifteen times slowly: “Ruby was made by intelligent people, there’s a logical explanation… Ruby was made by intelligent people…”
The last column makes total sense. NULL and NULL should be NULL. You have nothing to work off of! No idea about anything. True, False? “Is the answer to this question the same as if I asked you if true and false were the same thing?” Of course Excel wants to say NULL, but it can’t because people like Mitt Romney might throw a fit if they see a term so unfamiliar as NULL. So it just throws an error, which makes sense when you look at all the other behavior Excel has exhibited over the past couple paragraphs.
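The “NULL means unknown” rule that MySQL follows in both the “or” and “and” sections can be modeled in a few lines. This is only a sketch in Python, with None standing in for NULL; the function names are mine and this is not how MySQL is implemented internally, just the same truth table.

```python
from typing import Optional

Tri = Optional[bool]  # True, False, or None standing in for NULL ("unknown")


def sql_or(a: Tri, b: Tri) -> Tri:
    """OR where None means unknown: one known True settles the answer."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None  # the unknown side could still turn out to be True
    return False


def sql_and(a: Tri, b: Tri) -> Tri:
    """AND where None means unknown: one known False settles the answer."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None  # the unknown side could still turn out to be False
    return True


for left, right in ((None, True), (None, False), (None, None)):
    print(f"{left!r:>5} OR  {right!r:<5} => {sql_or(left, right)!r}")
    print(f"{left!r:>5} AND {right!r:<5} => {sql_and(left, right)!r}")
```

Unlike Python's built-in `and`/`or`, these two functions are commutative: swapping the arguments never changes the result.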
The “!=” Logical Operator
You know the drill at this point. “TRUE != NULL”: makes sense. NULL for MySQL ‘cause NULL is basically unknown, the rest say “hey, you know what, True really isn’t equal to NULL, that MySQL guy is on smack.”
Next Column: False != NULL. MySQL says “NULL could be anything. It could even be a boat! So we can’t tell if FALSE != NULL.” Ruby and Python ask us what MySQL is smoking and say, without a doubt, that FALSE isn’t nil/None.
Then comes Excel. Apparently, to Excel, an empty cell is not not equal to false. Which means it is equal to false. Except when you put it in an “or” operator or an “and” operator, then it isn’t false, otherwise those things wouldn’t have thrown an error, they would have returned false, or when used with an “and” operator they would have acted like false, not true.
Er…
At this point I don’t even care anymore. Onto the last column: NULL != NULL. Makes sense all round. Even in Excel.
The “Greater than” Logical Operator
I just included this one to make sure everyone knew that Python and Ruby couldn’t be counted on to return the same thing. Of course I’ve only chosen to look at some aspects of the NULL/TRUE/FALSE problem. I’m not even going to get started on “not nil or nil” problem. Or PHP. Or Ruby’s difference between Case equivalence and normal equivalence.
I don’t really have an overarching conclusion in this blog post. No “wtf Micro$oft engineers are dolts!” message. I look at each tool fulfilling each role extremely well and for their intended audience. There is a difference between NULL and nil, even if it makes it harder on multi-tool Data Guys, like myself. At least business-y people will feel more comfortable rocking a spreadsheet when their whole world is true-false-error, and I mean this sincerely. I must say, though, that I certainly do prefer MySQL’s consistent handling of NULL. Makes me wish that Ruby had its own type of unknown class. Also, don’t think that I don’t have love for MS Excel. Excel is amaaaaazing. By far MS’s best product. Understanding its quirks is just part of life.
If anyone wants to add to this list, I’ve made an open spreadsheet here. It is a Google spreadsheet, which I guess is kind of funny after all the MS Excel explaining I’ve had to do. Any questions for me? Send ‘em over to p.engineer@gmail.com.
Update from Pavpanchekha, who emailed me following my post (many thanks, I KNEW there was a reason):
Wanted to explain the strange inconsistencies you saw in your recent essay on Nil in Python/Ruby/Excel/MySQL.
You questioned the sanity of people who decided that nil and False was nil while nil or False was False. It all comes down to short-circuiting operators, that is, a specific optimization/feature that every programming language known to man has. (I stress programming, as MySQL and Excel really aren’t programming languages.) The basic idea is that an and statement, and an or statement, will only examine as many elements as they need to decide their value. This is good; if you have, say:
    cheap_function_that_is_usually_false() and big_computation()
you will almost always not compute big_computation(). It’s also a feature, not just an optimization. In Python:
    if len(s) > 1 and s[1] == "bob":
        do_stuff()
If and weren’t short-circuiting, this would cause errors, since it’d try to access s[1] even when len(s) == 1 (and thus there is no s[1]).
Now, how does the short-circuiting work? Well, it’s simple, really. For and: evaluate the first argument; if that’s false, return it, otherwise, evaluate and return the other argument. So:
    nil and False
Well, we evaluate the first. We get nil. Is nil truthy? No, because we want “if nil” to not do anything. So we return nil.
    False and nil
Well, we evaluate the first. We get False. Is False truthy? No, so we return False. And yes, this does mean that and and or are no longer commutative. But the benefits are great enough to justify it.
For or, the algorithm is similar. Evaluate the first. If that’s true (truthy), we’re done, so return it. Otherwise, return the other. So:
    nil or True
We evaluate the first. We get nil. Is nil truthy? No, so we return True.
    True or nil
We evaluate the first. We get True. Is True truthy? Yes, so we return it.
In this case, both orders were the same, but they’d be different if something truthy was used in place of nil: try True and 1 vs 1 and True.
Hope this explains something!
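To make the algorithm in that email concrete, here is a rough sketch in Python. The helper names `and_` and `or_` are mine, and the second operand is passed as a zero-argument function so that it really is left unevaluated when it isn't needed. It mimics the behavior of the built-in operators; it is not how the interpreter implements them.

```python
from typing import Any, Callable, List


def and_(first: Any, second: Callable[[], Any]) -> Any:
    """Mimic `first and second()`: return a falsy first operand untouched."""
    return second() if first else first


def or_(first: Any, second: Callable[[], Any]) -> Any:
    """Mimic `first or second()`: return a truthy first operand untouched."""
    return first if first else second()


calls: List[str] = []


def big_computation() -> bool:
    """Stand-in for an expensive call; records that it actually ran."""
    calls.append("ran")
    return True


print(and_(None, lambda: False))       # first operand is falsy, so it comes back
print(and_(False, lambda: None))       # same rule, different falsy value
print(or_(None, lambda: True))         # first is falsy, so the second is returned
and_(False, big_computation)           # short-circuits: big_computation never runs
print(calls)                           # still empty

s = ["alice"]
# The length guard stops the second operand from ever indexing s[1].
print(and_(len(s) > 1, lambda: s[1] == "bob"))

# Non-commutativity with the real operators, as suggested in the email.
print(True and 1, 1 and True)
```

The last line prints `1 True`, which is the order dependence the email mentions.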
I don’t really do illegal things. I’m actually a pretty top notch guy.
Some of the stories in the article may be embellished/fabricated. I do have to say that, right?
Bruce Schneier is one of the greatest minds of our time. Schneier on Security is a collection of some of his best essays from 2002 to 2008 and has really shaped my thinking on privacy and security. (Also available, at a higher price but personally signed, from his own website).
Back in December he posted this in response to Eric Schmidt’s (CEO of Google) claim that:
If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place. If you really need that kind of privacy, the reality is that search engines — including Google — do retain this information for some time…
I’ve thought about Schneier’s response (that people want privacy for a whole host of reasons, like when we make love, sing in the shower, and do things that are totally legal at the time of law) for some time now and I have come to this conclusion:
Yes. I do want privacy for those reasons. I do not want people knowing when I search for “smelly foot rash” or, even worse, “why do women cheat on good men”. These are embarrassing or very emotionally painful subjects that I don’t want anyone to know about. Say there is only a 0.1% chance that in the next year Google’s servers have a search history leak (between all their sharing of data back and forth with the US government). If it does happen, my searches will forever be available for people to find. I’m always logged into my Gmail account, so my coworkers wouldn’t even need to know my IP. All they would have to do is search “[my email] google search history leak” or possibly just my full name.
But that isn’t everything. I want privacy because I break the law and I don’t want to be fined or thrown in prison. No, I’ve never done or dealt illegal drugs. No, I don’t jack cars or commuter bikes. But I do break the law. Probably every day. Some things are minor: 12 km/h over the limit, parking for 2 seconds to drop something off when the sign clearly says “parking after 8 pm only.” Some things are major: keyloggers and password dictionary attacks while the Grade 11 English teacher was out of the room.
(Sidenote: My friends and I were stupid in high school. We never got caught with our hackety, crackity shenanigans and I never changed my grades. But it was still stupid. I also understand the hilarity of blogging about privacy after installing keyloggers on high school computers and dictionary-attacking teachers’ email passwords. At least I have the I-was-an-idiot-teenager excuse, unlike some major corporations.)
What about not even knowing about breaking the law? Let me ask you this: Have you ever committed a felony? Before you answer, have you read through and understood the millions of laws you must abide by? If not, your truest answer to the original question would be “I hope I haven’t committed a felony, and if I have, I hope nobody finds out because I don’t want to go to prison. I’m basically a good person and I don’t deserve to be financially ruined and separated from my family.”
It is unlawful for any person to import, export, transport, sell, receive, acquire, possess, or purchase any fish, wildlife, or plant taken, possessed, transported, or sold in violation of any Federal, State, foreign [!?], or Indian tribal law, treaty, or regulation.
…
Criminal penalties fall into two categories. For a felony offense, a maximum $250,000 fine per individual and $500,000 per organization, and/or up to 5 years imprisonment for each violation of the Act can be assessed. A misdemeanor offense carries a maximum $100,000 fine per individual and $200,000 per organization, and/or up to 1 year imprisonment.
Now I’m not an attorney, so I’m hoping I’m reading this wrong, but to me (and my completely limited knowledge of the law) this is a technically possible scenario:
You buy a lemon for Caesars at home with some friends. Unfortunately, last week Russia declared it illegal to possess lemons due to new Russian research that the rest of the world thinks is crazy. A Google search you made tipped off your local American authorities that you are breaking 16 USC 3370. Do Not Pass Go, Do Not Collect $200. Instead go to prison for 1 to 5 years after laying out up to $250k on a fine, unless you get an understanding judge.
Remember: Lack of knowledge of the law is NOT an exemption from the law. You are required to know and follow all the laws in your country, province/state, county/region, municipality/city. The government never really tells you that this is physically impossible. I couldn’t possibly read laws as fast as legislatures or councils write them, let alone catch up on centuries of already written laws and judicial interpretation.
Getting back to knowingly breaking the law. My mom had surgery a couple years back and ran out of Tylenol 3s (T3s are basically a small dosage of codeine with caffeine). Because I’ve had excruciatingly painful bi-yearly migraines since I hit adolescence I have an unlimited, legal supply of T3s. Personal use only, of course. But even though it was illegal, did I give my mom two or three T3s to keep her pain down until she could get her bottle refilled the next morning? You bet. I didn’t even blink. Was I trafficking narcotics (or whatever giving prescription drugs to other people is called)? You would have to ask a Canadian judge and jury that.
But luckily for me big brother doesn’t have a log of me giving my mom a couple pain killers.
This is why I want privacy. I break the law. Sometimes for good reasons, sometimes for stupid reasons. Now, I rarely knowingly break big laws, but I’m sure it has happened a couple of times. Have I ruined anyone’s life? No. Have I destroyed anyone’s wealth? No. Do I breach others’ privacy? Not since I was an idiot kid.
Then stop snooping. Leave me the hell alone. Maybe if I’m doing something online that I don’t want anyone to find out about I should do it anyway, safe in the knowledge that I live in a free country and that my right to privacy is assured - unless I do something that gives the police enough evidence for a judge-signed warrant.
(NOTE: This post is extremely old. I know this is horrible, horrible code, but I leave it up there for the kids.)
The challenge: calculate all numbers between 1900 and 2100 with exactly one operator between the sequential numbers 123456789 with as many brackets as desired.
Being a data guy I know a fair bit about Excel, MySQL, star schema data marts, statistics, and split tests - but very little on how to properly program in scripting languages; form, modularization, unit testing, etc… It is something I’m working on. I think that even though I came to the problem with a vastly inferior tool set for the job (as anyone could see from my unedited Ruby code) I was able to employ a couple of cool tricks to solve the problem.
When I first approached the problem many of the solutions had already been worked out by patio11 over at Hacker News. He had calculated 192 of the 201 solutions. What I love about this problem is just how simply it can be stated. Yet, without some clever trick, it would take a long time to find out if there is even a solution for each number.
I believe I was the first to calculate 200 of the solutions - there are actually 201 solutions between 1900 and 2100, but when I finished coding the program I was so high on results coming in through the first pass that I completely missed that fact until I talked about the problem with Ben Coe the next day.
I decided to not even look at the code already posted in the discussion of the problem because I was afraid I would end up going down the same path as others before me.
I approached it like this:
Forget about trying to brute force the problem. If it were possible to brute force it in any reasonable amount of time, then it would be solved by now. (This turned out to be wrong.)
Brackets must be important, otherwise the solution would have been found by now. (This turned out to be right.)
Use randomness! The number of possible combinations is so large that even if you randomly assign operators and nested brackets a million times, who cares? The duplication rate would be so small that it wouldn’t really matter. (This turned out to be the clincher.)
Here was my first pass: http://pastie.org/776041 (you may notice me trying to find 1911, one of the unsolved ones when I first attempted the problem)
In order to avoid potential memory leaks (and to make sure my script was still running correctly) I just made a controlling process that restarted the ruby script with ‘system “ruby numbers.rb”’ and counted the number of files in the storage directory with ‘system “ls -l num\ | egrep -c ‘^-‘”’. Here is where I should have checked for 201 files, rather than 200, but in the end it didn’t matter. I ended up rewriting “numbers.rb” to dynamically change the weighting of the operators, but it only marginally helped. What I should have done is rewritten it to dynamically change the weighting of the bracket placement rates. But despite these shortcomings I ended up finding the numbers in less than 8 hours of computing.
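The randomized approach described above can be sketched roughly like this. To be clear, this is not my original Ruby (that is in the pastie link); it is a small Python illustration written under a few assumptions: the allowed operators are +, -, *, and /, a "random bracket placement" means picking a random split point at every level, and the names random_expression and random_search are made up for this sketch. Exact fractions are used so that division never introduces rounding errors.

```python
import random
from fractions import Fraction

# Assumed operator set; the original challenge may have allowed others.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def random_expression(digits, rng):
    """Return (value, text) for a random fully bracketed expression over digits, kept in order."""
    if len(digits) == 1:
        return Fraction(digits[0]), str(digits[0])
    split = rng.randint(1, len(digits) - 1)  # random bracket placement
    left_val, left_txt = random_expression(digits[:split], rng)
    right_val, right_txt = random_expression(digits[split:], rng)
    op = rng.choice(list(OPS))
    if op == "/" and right_val == 0:
        op = "+"  # sidestep division by zero rather than retrying
    return OPS[op](left_val, right_val), f"({left_txt} {op} {right_txt})"

def random_search(lo, hi, tries, seed=0):
    """Try `tries` random expressions; keep the first one seen for each integer in [lo, hi]."""
    rng = random.Random(seed)
    digits = list(range(1, 10))
    found = {}
    for _ in range(tries):
        value, text = random_expression(digits, rng)
        if value.denominator == 1 and lo <= value <= hi:
            found.setdefault(int(value), text)
    return found

if __name__ == "__main__":
    hits = random_search(1900, 2100, tries=20000)
    for target in sorted(hits):
        print(target, "=", hits[target])
    print(len(hits), "of 201 targets found so far")
```

How many targets a given run finds depends entirely on the seed and the number of tries, so treat the final count as a progress report rather than a result. Weighting the split points or the operators, as mentioned above, would mean replacing the uniform rng.randint and rng.choice calls with weighted draws.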
I told Toronto hacker friends and coworkers Ben Coe and Justin Giancola about the problem.
We talked genetic algorithms over beers, dismissing them, but on the bus ride home Ben came up with an amazing solution (which he implemented in Java, but it doesn’t really matter what language you choose). Later, Justin implemented a Common Lisp solution that has astonishingly few lines of code. Ben continues the story in Part Two of Three here: Ben’s Awesome solution. (If you’re a Lisp nerd and just want to jump to Part Three of Three by Justin it can be found here: Justin’s wicked small Lisp solution)
Rot13 Spoiler on Ben’s solution: Ora’f cebtenz pnyphyngrf gur nafjref svir gubhfnaq gvzrf snfgre guna zvar
The Elements - A Perfect Coffee Table Book for Nerds
Most of the time when I get books for Christmas from my nontechnically inclined friends and family I don’t end up reading them because they are either:
An uninteresting variety of This is how the future is going to look, or,
A book I already own (e.g. Alan Greenspan’s “new” book).
So while I was holding the unopened present that my girlfriend’s parents handed me this past Christmas Eve I was a little nervous. I knew it was a book (based on the density and flexibility of the present) and I knew there was a very high chance that I wasn’t going to like it.
Boy, was I wrong.
The Elements by Theodore Gray (co-founder of Wolfram Research) dedicates 2 to 4 pages to each known element of the universe (except really funny elements like roentgenium; basically, if your spell checker knows it, it has its own page). Here’s why the book is ridiculously awesome:
The quality of the photography and printing. Even Arsenic looks beautiful.
The thoroughness of the information about each element. Atomic weight, density, crystal structure, atomic emission spectrum, an easy to follow state of matter graph, etc.
The number of pictures per element. 13+ high quality shots of Uranium in various forms (including “Fiestaware” made prior to 1942. Ahhh, the good ol’days. A man could eat out of a radioactive bowl at the local Mexican food joint without someone telegraphing the police).
The lighthearted and often hilarious way the author writes about each element and the remainder of the book. Take this passage from the end of the book: “[Element collecting is] best enjoyed in responsible moderation - keep too much uranium (92) in the office, and people start asking questions (keep over 15 pounds, and the Feds start asking questions)” or from elsewhere in the book: “Copper is wonderful stuff. Just wonderful. Many other elements have some kind of gotcha about them: maybe they are great in every way except they’re poisonous, or they would be perfect except they explode when they touch water. Copper has no gotcha - it’s just nice stuff all around.”
It’s a hardcover that comes with a tear-out, picture-based table of elements on the last page. *shrug* I like hardcovers and tear-out pages, ok?
I’ve read it twice now, and it is a great book all round. It’s clear that the author is extraordinarily passionate about science and educating people. If you have doubts that a book about every element could ever be interesting, let alone the perfect coffee table book for nerds, take a look at: http://www.periodictable.com/theelements/pages.html. Here we find the author’s very own website where you can browse through some of the pages (although, unfortunately, you can’t read them). He also has a fun little interactive online periodic table here: http://www.periodictable.com/.
Whenever I write about a book that I love I’ll be putting in an Amazon referral link to help fund my reading addiction. In this case, however, I’m hoping you’ll buy the book from the author’s own site, even though it costs a little more. The author is clearly passionate about educating people and this will get him a little extra coin. Also, you’ll get an autographed copy of the book. How cool is that?