Tuesday, December 25, 2012

Designer genes - How the forces of natural selection are about to be replaced by the forces of human selection - Stephen Potter

Surveys developments in genetics over the last fifty years, in particular developments which have led towards the possibility of genetic engineering of humans. These include:
  • The double helix model of DNA (Watson/Crick, Nobel prize 1962)
  • The sequencing of the human genome: DNA sequencing (Sanger/Maxam/Gilbert; Nobel prize 1980 to Sanger and Gilbert), protein sequencing (Sanger, Nobel prize 1958), Human Genome Project (Collins/Venter)
  • PCR: Creation of a large number of copies of a DNA sequence (Mullis, Nobel prize 1993)
  • Stem cells: Conversion of adult cells to stem cells (Yamanaka, Nobel prize 2012), allowing creation of multiple embryos by turning stem cells into gametes
  • Modification of the genes of a cell (Capecchi, Evans, Smithies, Nobel prize 2007)
Discusses the ethical implications of technology that commercializes and combines this research to allow human genetic engineering.

The current state of genetic engineering and what might be possible in the next few years:

The following technologies have been demonstrated in research. Commercialization of these technologies to reduce cost and increase speed is underway and advances are expected in the next few years:
  • Preimplantation Genetic Diagnosis (PGD): Identifying the presence or absence of a particular gene in an embryo at an early stage (8-cell embryo), by extracting a single cell from an 8-cell embryo created by IVF.
  • Complete DNA sequencing on an embryo cell in a short time (order of hours)
  • Creation of thousands of embryos simultaneously through stem cells
  • Screening of multiple embryos in parallel
  • Modification of the genes of an embryo cell followed by implantation

Which genes do what

  • DNA: A long molecule consisting of sequences of bases (A, G, C, T)
  • Codon: A triplet of bases that codes for an amino acid. 4 bases => 4^3 = 64 possible codons
  • Amino acids: Building blocks of proteins:
    • 20 possible amino acids coded by 64 possible codons => the code is redundant: most amino acids are specified by more than 1 codon, and 3 codons are stop signals that do not specify an amino acid
  • Gene: Sequences of codons that code for protein generation
  • RNA: Long molecule consisting of bases ( A, G, C, U)
    • RNA can act as genetic material as well as an enzyme (a role usually played by proteins). Might explain origin of life (Altman, Cech (Nobel prize 1989))
  • Proteins: Chain of amino acids (few hundred) => more than 20^100 possible proteins
  • Generation of proteins from DNA:
    • Transcription: DNA used to generate mRNA, aided by protein RNA polymerase
      • RNA polymerase is a protein that takes one strand of DNA and transcribes an RNA copy (only the genes)
    • Translation: mRNA to Protein, happens in the ribosomes of the cell
  • Gene differences
    • Mutations: Deletion of a block in the DNA sequence -> Deletion of a block in the protein
    • Frame shift: Deletion of a single base shifts the reading frame, changing every codon after it
    • Single base difference: Single nucleotide polymorphism (SNPs):
      • Can be in codons or non coding part
    • An individual has 2 copies of each gene: One from each parent
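The codon arithmetic and the transcription/translation steps in the list above can be sketched in a few lines of Python. This is a minimal sketch rather than a bioinformatics tool: the function names (`transcribe`, `translate`) and the example gene are illustrative assumptions, the input is assumed to be the coding strand of the DNA, and the lookup table is the standard genetic code written out in the conventional TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

n_codons = len(CODON_TABLE)                    # 4^3 = 64
n_amino_acids = len(set(AMINO_ACIDS) - {"*"})  # 20
n_stop_codons = AMINO_ACIDS.count("*")         # 3

gene = "ATGGCCAAGTAA"                 # hypothetical gene: three codons plus a stop
normal = translate(transcribe(gene))  # "MAK"
# Frame shift: deleting one base changes every codon downstream of the deletion.
shifted = translate(transcribe(gene[:3] + gene[4:]))  # "MPS"
```

Deleting the fourth base leaves the first codon intact but changes both codons after it, which is why a single-base deletion is usually far more damaging than a single-base substitution.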

Sequencing the human genome

  • Genome: 3 billion bases
  • DNA per protein: 3 bases per amino acid, 100 -1000 amino acids per protein
    • => 3 billion/3000 = 1 million potential proteins (genes) per genome
  • Turns out there are only 30K genes in the human genome
    • => Approx 2.5% of the genome codes for proteins
    • Determined by transcription analysis of RNA
  • Differences in people = 0.1% of the genome i.e. 3 million bases out of 3 billion
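The back-of-the-envelope numbers in this list can be reproduced directly. The sketch below uses only the round figures quoted in the notes; the variable names are illustrative. With exactly 30K genes of 3000 bases each the coding share comes out at 3%, the same order of magnitude as the roughly 2.5% quoted above.

```python
GENOME_BASES = 3_000_000_000      # bases in the human genome
BASES_PER_AMINO_ACID = 3          # one codon
AMINO_ACIDS_PER_PROTEIN = 1_000   # upper end of the 100-1000 range
GENES = 30_000                    # gene count quoted in the notes

bases_per_gene = BASES_PER_AMINO_ACID * AMINO_ACIDS_PER_PROTEIN  # 3000
potential_genes = GENOME_BASES // bases_per_gene                 # 1 million
coding_fraction = GENES * bases_per_gene / GENOME_BASES          # 0.03
differing_bases = GENOME_BASES // 1000                           # 0.1% = 3 million

print(f"potential genes: {potential_genes:,}")
print(f"coding fraction: {coding_fraction:.1%}")
print(f"bases that differ between two people: {differing_bases:,}")
```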

Sequencing revolution

  • Objective: Identify gene combinations (from the millions of SNPs in a single DNA sequence), that contribute to variation/disease
  • Map disease/traits to SNPs
  • 3 million differences between individuals
    • => Needs huge amount of data (samples from large population, mapped to diseases), computational power
  • As the complexity of the genetic cause of a variation increases (more SNPs), the number of samples needed to identify it increases

Time scales

  • Genetic evolution can be rapid when it is directed: Order of 1000s of years
  • E.g. Evolution of dogs from wolves directed by man

Gene expression

  • Transcription factors: Proteins that regulate transcription i.e. expression of genes
    • Like all proteins, their generation is impacted by the genome
    • Can cause genetic cascades: Hundreds of genes altered in expression level
  • Introns: Intervening sequences: Non coding sequences that interrupt the coding sequence of a gene
    • Increased flexibility of gene expression
    • Introns are removed from the RNA transcript, a process called RNA splicing
    • Exons: Expressed sequences
    • 2% of sequence are introns that regulate gene expression
  • Genetic regulatory network: Interwoven connection of genes, with some regulating others
    • 3% of the genome is regulatory

Jumping genes

  • Transposable elements: Discrete parts of the chromosome, capable of moving from one chromosome position to another (McClintock)
    • Enzymes allow the transposable elements to copy themselves, float around and attach to a new place in the DNA sequence
    • Drosophila melanogaster: P elements detected in wild fruitfly DNA, which differed from DNA extracted from fruitflies a few years previously.
  • Horizontal gene transfer: Viral DNA: Retroviruses can convert RNA to DNA using enzyme called reverse transcriptase (Temin/Baltimore, Nobel 1975)
    • Causes hybrid dysgenesis i.e. reduced fitness
    • Repressors
  • Transduction: Moving DNA material from one species to another
    • Balance between harmful effects (which are subject to survival of the fittest) and preferential replication. Can spread because of ability to outreplicate the competing genome sequences.
  • DNA: 2% coding, 3% gene expression, 30% parasitic transposable
    • Why is there >50% with no known purpose: Has evolution created this unused portion to mitigate the effects of transposable DNA?

Genetic disease

  • Every gene has two copies - one from each parent
  • A recessive mutant gene carried by both parents => child has a 25% chance of inheriting two bad copies
  • Nature vs. nurture: Minnesota twin study: 70-80% of IQ is genetic
  • Genes contribute a surprising amount to psychological traits such as love and faith
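The 25% figure follows from a Punnett square: each parent passes on one of its two copies with equal probability. A minimal sketch, assuming a single gene with one working allele `A` and one recessive mutant allele `a` (both labels and the function name are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Punnett square: pair every allele of one parent with every allele of the other."""
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Both parents are carriers: one working copy, one mutant copy each.
carriers = offspring_genotypes("Aa", "Aa")
for genotype, probability in sorted(carriers.items()):
    print(genotype, probability)
```

For two carriers this gives 1/4 unaffected (`AA`), 1/2 carriers (`Aa`) and 1/4 with two mutant copies (`aa`).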


  • Gametes: Have 23 chromosomes
  • Chromosome: Long DNA molecule
    • Individual has 23 pairs of chromosomes: one set from each parent
    • One copy of each chromosome goes into sperm/egg during meiosis => 2^23 possible gametes per parent, so 2^46 combinations from a pair of parents (before counting crossover)
    • Identical twins: Monozygotic (single egg, fertilized by single sperm, divides after blastocyst stage)
    • Fraternal twins: Dizygotic (two eggs fertilized by different sperm)
    • Chimera: Fusion of multiple eggs (fusion of two dizygotic embryos)
  • When number of cells < 32 (?), cells can develop into any type of cell

Stem cells

  • Origin: 
    • Embryo cells
    • Adult stem cells: Bone marrow cells
  • Programming adult cells to become stem cells
    • History of development:
      • Gene expression appears to be controlled by a master switch and a genetic pyramid of hierarchy.
      • Genes at the top of the hierarchy control those below them
      • Homeobox controls the master blueprint i.e. type of the species, e.g. fruitfly vs. mouse
      • Single genes can initiate extensive development programs e.g. expression of one gene can drive the growth of a leg
    • Genes can make adult cells revert to stem cells
      • 4 genes activated in a mouse cell reverted it to a stem cell (Yamanaka, Nobel 2012)
    • Implication: Egg cells can be created, increasing the number of samples available for selection (over the 500 created)

Gene modification

  • Technology to modify genes (Capecchi, Evans and Smithies, Nobel 2007)
  • Procedure:
    • Desired version of the gene is created in a test tube
      • Generated using DNA synthesis machine or recombinant DNA strategies
    • Synthetic gene introduced into stem cells grown in an incubator
      • Modified gene introduced into stem cells by electroporation
      • Process of change is not clearly understood, happens by DNA recombination similar to meiosis
    • One of these correctly engineered stem cells is used
      • Only 1 in a million stem cells can be used, need screening to detect the stem cells which are good
      • Screening done by polymerase chain reaction (PCR), similar to the procedure used in DNA matching
    • Create large number of copies of the DNA sequence (Mullis, Nobel prize 1993)
    • Genetically altered stem cells added to a blastocyst
  • Stem cell cloning: Dolly the sheep, 1997, nucleus of a mammary gland cell

Ethical questions

  • Ontogeny recapitulates phylogeny: Development of the individual copies evolutionary history
    • E.g. In embryos a primitive pair of kidneys is formed followed by a more advanced pair, and then the final pair
    • Is the early embryo truly human?
  • Optimal gene combinations
    • Sickle cell gene provides resistance to malaria: Genes are tradeoffs, not binary decisions
    • Connection between artistic genius and mental illness
  • Appearance of the human variant of the FOXP2 gene, associated with speech (chimpanzees carry a different variant), coincides with explosion in rate of progress.
  • Is genetic engineering any different from eugenics, improvement of the gene pool via human selection?

Thinking, Fast and Slow, Daniel Kahneman

Kahneman's work explains areas in behavioural economics, specifically prospect theory (how decisions are made when outcomes are probabilistic) and the effects of cognitive biases on choice. It explains the process by which people reach conclusions/make decisions and why the choices are often the wrong ones. The book has several examples that illustrate errors of judgement and choice in analytical situations, mainly the results of cognitive biases. The book covers research by Kahneman and others from the 1970s to the present. Prior to the work described here, one of the assumptions made by economists/social scientists in their research was that people are rational. Departures from rationality were believed to be functions of emotion. Kahneman's research made the claim that departures from rationality are because of flaws in cognitive machinery i.e. cognitive biases. His work describes how the mind works based on recent developments in psychology. The mind is subject to the influence of heuristics, intuition and biases and its functioning can be explained by three models:
  • A model of the mind consisting of two components:
    • System 1: Fast automatic thinking: By intuition or by expertise
    • System 2: Slow engaged thinking: Deliberation, algorithmic, measured
This model explains how and why humans reach erroneous conclusions when presented with simple mathematical choices. The book describes 10-15 heuristics and biases which cause System 1 to reach erroneous conclusions.
  • Two economic models of human behaviour called Econs (rational, selfish and invariant in tastes) and Humans (real people). Modern economic theory/ modelling is based on Econs which explains why economic models to date are flawed.
  • The Experiencing self and the Remembering Self: Two ways in which humans consider memories of events which cause incorrect decisions because of incorrect assessments of past experiences.
The work uses these models to illustrate how modern economic models are flawed and how human decision making is flawed when evaluating decisions involving risks.  

Part I: This section describes the systems.

  • The two systems:
    • System 1: Operates quickly, no effort, no voluntary control
    • System 2: Deliberate, requires attention, can reprogram System 1 for a specific task
    • The division of labor between the two systems maximizes performance and minimizes effort.
  • Attention/Effort
    • It takes effort for System 2 to get engaged.
    • Law of least effort: A person will engage the system that allows the task to be performed with least effort.
    • Experts in any field are able to solve problems in their field using System 1.
  • Lazy control
    • System 2 is engaged less often than it should be, because of "laziness"
    • Cognitive load: Load placed on the mind because of System 2 being engaged in one task.
    • Ego depletion: Depletion of self control causes System 1 to be engaged because of cognitive load on System 2 on another task.
    • The nervous system consumes more glucose than most of the rest of the body
    • Unless explicit effort is made, an individual will favor using System 1 without engaging System 2
    • System 2 can be divided into 2 components:
      • Intelligence: IQ
      • Rationality: Immunity to bias
  • Association
    • Association (ideas or suggestion) affects System 1's perceptions/decisions
    • Priming affects System 1's perception/decision
  • Cognitive ease/Cognitive strain
    • Measure of an individual's current condition, can predict likelihood of using System 1 vs System 2
    • When in a state of cognitive ease, System 1 predominates
    • Cognitive ease can be brought on by association, priming
    • Cognitive strain can be brought on by associated difficulties (e.g. bad fonts)
  • Norms, causes
    • Past events can cause System 1 to believe in a norm i.e. a stereotype, perception of normal behavior
    • The mind has a need to assign causality to events
    • System 1 is incapable of making correct conclusions about causality - it does not have the ability to think statistically
  • How conclusions are reached by System 1
    • Confirmation Bias: A deliberate search for confirming evidence
    • Halo effect: Tendency to reach erroneous conclusions in one dimension based on liking a person for another dimension
    • Limited evidence (WYSIATI, "what you see is all there is"): base rate neglect, framing effects, overconfidence
  • How judgments happen in System 1 when inadequate information is provided
    • Neglect of information, use of basic assessments
  • How questions are answered:
    • Substitution: Faced with a difficult question, an individual uses a heuristic to find a simpler question that can be answered, and substitutes its answer
    • Affect heuristic: Likes and dislikes determine beliefs about the world

Part II: Heuristics and Biases: This section lists a number of biases/heuristics/intuitive conclusions which cause System 1 to reach erroneous conclusions.

  • Law of small numbers:
    • Even researchers make mistakes on sample size: Sample size is low, even in research experiments. A small sample will exaggerate the effect of outliers.
    • System 1 believes it can see order, where randomness exists
    • Causal explanations of chance events are invariably wrong
    • Solution: When conducting experiments: Decorrelate results by averaging
  • Anchors
    • Providing an anchor when asking a question can influence the response: E.g. would you contribute $100 to this cause? If not, how much?
  • Availability
    • Availability of the memory of events, can influence perception of frequency of the events
    • Difficulty in remembering a large number of events can alter perception of frequency, even if the absolute number is higher
  • Impact of availability
    • Emotional tail wags the rational dog
    • Availability bias attempts to create a world that is simpler than reality
    • Availability cascade: Emotional response to an available event is amplified and results in bias flowing into public policy
  • Representation bias:
    • Stereotyping used without examination of bias, or stats about accuracy of stereotypes
    • Base rate information will always be rejected when specific instance information is available
    • Always apply Bayesian analysis
  • Representation bias with varying degrees of information
    • System 1 often judges conditions with smaller population to be more likely than condition with a larger population because it satisfies a representation bias
  • Causes vs Statistics
    • Base rates are ignored, even causal statistics may not change deeply held beliefs
  • Regression to mean
    • Regression to the mean is often interpreted as a causal event
    • Regression and correlation are related concepts. Where correlation is not perfect, there will be regression to the mean
  • Taming intuitive predictions
    • Use correlation to obtain a prediction that lies between an intuitive prediction and the base rate
    • Unbiased predictions will not predict extreme cases, unless a lot of information is available
    • In some cases, such as venture capital, this may be detrimental because investors are searching for extreme cases
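Two of the corrective recipes in this part are plain arithmetic: Bayes' rule for combining a base rate with specific evidence, and a regressive prediction that sits between the base rate and the intuitive estimate. The sketch below is illustrative only. The function names are hypothetical, the first example uses the well-known cab problem numbers, and the GPA numbers are made up.

```python
def posterior(prior: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Bayes' rule: probability of the hypothesis given the evidence."""
    evidence = prior * hit_rate + (1 - prior) * false_alarm_rate
    return prior * hit_rate / evidence

def regressive_prediction(base_rate: float, intuitive: float, correlation: float) -> float:
    """Move from the base rate towards the intuitive estimate in proportion to the correlation."""
    return base_rate + correlation * (intuitive - base_rate)

# Cab problem: 15% of cabs are Blue; a witness who is right 80% of the time says "Blue".
p_blue = posterior(prior=0.15, hit_rate=0.80, false_alarm_rate=0.20)
print(f"P(cab was Blue) = {p_blue:.2f}")   # about 0.41, well below the intuitive 0.80

# Hypothetical forecast: class average GPA 3.0, gut feeling 3.8,
# and the available evidence correlates about 0.3 with GPA.
gpa = regressive_prediction(base_rate=3.0, intuitive=3.8, correlation=0.3)
print(f"Regressive GPA prediction = {gpa:.2f}")   # about 3.24
```

With a correlation of 0 the prediction collapses to the base rate, and with a correlation of 1 it equals the intuitive estimate, which matches the note that unbiased predictions are extreme only when a lot of information is available.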

Part III: Overconfidence: Other reasons System 1 makes mistakes

  • Illusion of understanding
    • The mind creates an illusion of understanding by believing WYSIATI
    • Hindsight bias creates the illusion that outcomes were obvious and that decisions were obvious
    • Outcome bias affects the perception of decisions based on the results
    • Halo effect affects the perception of human decisions based on organization outcomes
  • Illusion of validity: A cognitive illusion
    • The illusion of skill/validity
    • Supported by a powerful professional culture
    • Hedgehogs and foxes: hedgehogs fit events to a single framework and predict based on that
    • Media favors appearance of hedgehogs in debates
  • Intuition vs Formulas
    • System 1 is influenced by several factors (priming etc. above)
    • The result is that statistical prediction will generally outperform human expert prediction (Meehl, Clinical vs. Statistical Prediction)
    • Humans tend to try to think outside the box, adding complexity that reduces predictive accuracy
    • When predictability is poor, inconsistency (generated by System 1) destroys predictive validity
    • Broken leg rule: Occurrence of outlier events impacts prediction
    • Combining predictors with equal weights (averaging them) is often as good as, or better than, a linear multiple regression algorithm
  • When can we trust expert intuition
    • Other school of thought: Naturalistic Decision Making: Seeks to understand how intuition works (Gary Klein, Sources of Power)
    • Intuition: System 1 implements rapid pattern recognition with System 2 executing a deliberate process to make sure that the decision will work
    • Requirements:
      • An environment that is regular enough to be predictable
      • Prolonged practice at identifying the regularities
      • E.g. Chess players can rapidly and intuitively recognize a situation as weak or strong, but this needs approx 6 years of practice at 5 hrs/day
  • The outside view
    • Inside view vs. Outside view: Knowledge about an individual case makes an insider feel no need for the statistics of the case
    • Exhibited as a belief in the uniqueness of the case
    • Planning fallacy: Plans and forecasts are unrealistically close to the best case
  • The engine of capitalism
    • Irrational optimism: Optimistic bias plays a dominant role in risk taking
    • Overconfidence in one's own forecast: An effect of System 1 and WYSIATI
    • Remedy: Prepare a premortem for all decisions: Assume that the decision has been made and has resulted in a disaster, then write the postmortem

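The "formula" side of the intuition vs. formulas comparison can be very simple. A minimal sketch of an equal-weight score, assuming each predictor is first standardized so that no single scale dominates; the predictor names and candidate values are hypothetical:

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def equal_weight_scores(predictors):
    """Average the standardized predictors; `predictors` maps a name to one value per candidate."""
    columns = [standardize(column) for column in predictors.values()]
    return [mean(row) for row in zip(*columns)]

# Three hypothetical candidates rated on three predictors (higher is better).
candidates = {
    "test_score": [60, 75, 90],
    "interview_rating": [3, 5, 4],
    "years_experience": [1, 4, 7],
}
scores = equal_weight_scores(candidates)
ranking = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print(ranking)   # candidate indices, best first
```

No weights are fitted, so there is nothing to overfit on a small sample, which is one reason such simple formulas hold up well against both experts and fitted regressions.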
Part IV: Choice: What influences human choice

  • Bernoulli's errors
    • Humans vs. Econs
      • Econs: Rational, Selfish, Maximize utility, Tastes do not change
    • Utility theory (Bernoulli)
      • Prior to Bernoulli, outcomes of gambles were compared based on outcomes (expected values)
      • Bernoulli realized that people dislike risk and this was explained by diminishing marginal value of wealth
      • Assigned a utility to each value of wealth, though the increase in utility decreased as wealth increased
      • Diminishing returns
      • Explains insurance: Risk is transferred from poor person (with higher loss of utility) to a richer person (lower loss of utility)
  • Prospect theory:
    • Utility theory has a flaw: Utility is not absolute, it depends on the reference point
    • Difference in utility depends on direction: A loss of $500 has greater negative utility than the positive utility of a gain of $500
    • Depends on increase/decrease: E.g. $5M has a different utility if it is considered in the context of an increase from $1M to $5M or a decrease from $10M to $5M
    • Taking this into account will result in different predictions for how willing a poor or rich person is to take risk
    • Conclusion: If all options are bad, people tend to prefer gambling/risk taking, else they avoid risk
    • Prospect theory
      • How financial decisions are made:
      • Evaluation is relative to a reference point: status quo
      • Diminishing sensitivity to evaluation of changes
      • Loss aversion
      • Gain/loss vs. Psychological utility is an S curve, but not a symmetric curve
      • Problems: Does not account for regret, disappointment
  • Endowment effect
    • Decisions are impacted by whether a good is meant for exchange or for use
    • Psychological value of a good for use, such as a mug or an already possessed good can change the utility of selling it
  • Bad events
    • Loss aversion is with respect to a reference point
    • Not achieving a goal may be a loss, exceeding a goal may be a gain
    • Impacts negotiations, where parties fight harder to avoid losses than to make gains
    • In a negotiation both parties feel they have lost more than gained
    • The asymmetry between feeling of gain/loss impacts the feeling of fairness: Can impact whether customers choose to buy products whose prices have risen
    • Fairness: It is considered unfair to impose losses on a customer, relative to his reference point
    • Reference points cause a sense of entitlement
  • Fourfold pattern
    • Overweighting of small probability events
    • Decision weights are not identical to probability weights
    • => The expectation principle (weighting outcomes by probability) is flawed
    • Decisions are made based on decision weights not probabilities
    • Decision weight d = probability p only at p=0 and p=1; d != p for all other values (d > p for small p, d < p for large p)
    • Possibility effect: outcomes just above p=0 are overweighted. Certainty effect: outcomes just below p=100% are underweighted
    • Fourfold pattern: Gain/Loss vs. High/Low probability
    • The fourfold pattern shows how a high/low probability of a gain or loss results in acceptance/rejection of unfavorable/favorable outcomes in negotiations, because of aversion to loss/hope of gain and the consequent risk aversion/risk taking
  • Rare events
    • People overestimate probabilities of unlikely events
    • People overweight unlikely events
    • Vivid or alternative descriptions of events influence decision weights (1 in 1000 vs. 0.1%)
  • Risk policies
    • People tend to be risk averse in gains and risk seeking in losses
    • Broad framing (the grouping of several decision problems into a single problem) can result in better decisions than narrow framing (separately deciding each problem).
    • Samuelson's problem: Aversion to a single gamble vs expected value of several hundred instances of the gamble
    • Since a life will consist of several such small gambles, it pays to take the small gambles
      • Gambles must be independent experiments
      • Gambles must not be excessive
      • Gambles must not be long shots
    • Loss aversion + narrow framing leads to bad (risk averse) decisions
    • E.g. individual managers are risk averse because they take individual decisions. A CEO frames the decisions broadly, and favors taking a risk, in the hope that statistically one of them will pay off
  • Keeping score
    • Disposition effect: A product of narrow framing: E.g. the tendency to sell winning stock in preference to losing stock, because of the pain caused by acknowledging and closing a losing stock.
    • Sunk cost fallacy: Tendency to throw good money at a bad project in the hope of salvaging it
    • Regret/blame: People have stronger reactions (regret) to an outcome produced by action than to the same outcome produced by inaction
    • There is an aversion to trading increased risk for any other advantage, even if the advantage is significantly more gainful than the risk
    • Regret/hindsight bias cause regretful feelings when only a moderate amount of thought has gone into decisions
    • Think deeply and anticipate regret, or think little.
  • Reversals
    • Preference reversals: Preference can change when two choices are compared jointly vs. if they are presented singly
  • Frames and reality
    • Losses cause stronger negative feelings than costs
    • Framing can impact decisions: gallons per mile vs. miles per gallon
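The shape of the prospect theory curves and the case for broad framing can both be made concrete. The sketch below is an illustration, not the book's own calculation: it assumes the functional forms and parameter estimates published by Tversky and Kahneman in 1992 (curvature 0.88, loss aversion 2.25, weighting parameter 0.61), a hypothetical 50/50 gamble that wins $200 or loses $100 in the spirit of Samuelson's problem, and illustrative function names. The felt value of the single gamble uses plain 50/50 weights as a simplification.

```python
from math import comb

ALPHA = 0.88    # diminishing sensitivity (assumed estimate)
LAMBDA = 2.25   # loss aversion coefficient (assumed estimate)
GAMMA = 0.61    # curvature of the probability weighting function for gains (assumed)

def value(x: float) -> float:
    """Psychological value of a gain or loss x, measured from the reference point."""
    return x ** ALPHA if x >= 0 else -LAMBDA * (-x) ** ALPHA

def decision_weight(p: float) -> float:
    """Inverse-S weighting: small probabilities overweighted, large ones underweighted."""
    return p ** GAMMA / (p ** GAMMA + (1 - p) ** GAMMA) ** (1 / GAMMA)

def prob_net_loss(n: int, win: int = 200, lose: int = 100, p_win: float = 0.5) -> float:
    """Probability that n independent plays of the gamble end with a net loss."""
    return sum(
        comb(n, k) * p_win ** k * (1 - p_win) ** (n - k)
        for k in range(n + 1)
        if k * win - (n - k) * lose < 0
    )

# Narrow frame: one play feels bad even though its expected value is +$50.
felt_value_single = 0.5 * value(200) + 0.5 * value(-100)   # negative

# Broad frame: the chance of losing money shrinks as plays are bundled together.
for n in (1, 10, 100):
    print(n, prob_net_loss(n))

# Rare events: a 1% chance is weighted as if it were several percent.
print(decision_weight(0.01), decision_weight(0.99))
```

A single play carries a 50% chance of a loss, while one hundred plays carry a chance well under 0.1%, which is the arithmetic behind "it pays to take the small gambles" provided they are independent and not excessive.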

Part V: Two selves: How memories are assessed

  • Two selves
    • Experienced utility vs. Decision utility
    • Experiencing self vs. Remembering self
    • Experience expresses satisfaction of the whole experience, while remembering may only remember selected parts of the whole experience
    • Peak end rule: Intense events towards the end of an experience are remembered
    • Duration neglect: Durations of experiences are often forgotten while intensity is not
  • Life as a story
    • Duration neglect, peak end rule and the remembering self impact decisions
  • Experienced well being/Thinking about life
    • Measures of happiness reflect the remembering self not the experiencing self
    • Affective forecasting: Predicting one's own future feelings; often wrong because recent significant memories dominate opinion
    • Focusing illusion: Nothing is as important as you think when you are thinking about it
  • Conclusions
    • System 1/System 2, Econs/Humans, Experiencing self/Remembering self

Monday, November 26, 2012

Crossing the chasm: Marketing and selling high tech products to mainstream customers - Geoffrey Moore

Discusses the difficulties that technology firms face in moving from use by early adopters to mass adoption by the mainstream. The difficulties are discussed in the context of the technology adoption life cycle and the chasm between customers who are early adopters/innovators and those who are the early majority (pragmatists). Navigating this chasm is a stage that is sufficiently different from both early growth and the later stages. Moore discusses this stage and how to manage it. The book defines the chasm, the nature of customers on either side of it, and how the chasm is to be approached:
  • Identifying a niche market to attack
  • Defining the "whole" product
  • Positioning the product in relation to its competition
  • Execution of the attack through distribution and pricing
Concludes with a discussion of how a company must evolve and some of the issues that must be addressed (personnel/compensation) post chasm.

Chapter 1: Chasm defined

  • Four types of customers
    • Early adopters/Innovators, Early majority, Late majority, Laggards
    • The population is distributed in a bell curve
    • Each category spans roughly one standard deviation of the curve
    • Normal technology adoption life cycle (TALC) moves from one segment to the next, left to right
  • Innovations are continuous or discontinuous
    • Technology innovations are often discontinuous
    • This has the effect of disrupting the TALC
  • Gaps/cracks between each segment
    • Innovator to early adopter
    • Early adopter to Early majority: Early adopter wants a change agent, Early majority wants a productivity improvement
    • Moving from early adopter to early majority is a move from a market with no reference/support to one with a well defined reference/support model
  • The chasm is the gap between the early adopter/innovator and the early majority
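The bell curve description translates into approximate segment sizes. The sketch below assumes the conventional cut points at -2, -1, 0 and +1 standard deviations from the mean, with innovators split out from early adopters; the notes only say that the segments are a standard deviation wide, so the exact boundaries are an assumption.

```python
from math import erf, inf, sqrt

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# (segment, lower bound, upper bound) in standard deviations from the mean (assumed cut points).
SEGMENTS = [
    ("Innovators", -inf, -2),
    ("Early adopters", -2, -1),
    ("Early majority", -1, 0),
    ("Late majority", 0, 1),
    ("Laggards", 1, inf),
]

shares = {
    name: normal_cdf(upper) - normal_cdf(lower)
    for name, lower, upper in SEGMENTS
}
for name, share in shares.items():
    print(f"{name:15s} {share:.1%}")
```

Under these assumptions the two groups before the chasm add up to about 16% of the market, while the early majority alone is about 34%, which is why the revenue on the far side of the chasm matters so much.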

Chapter 2: Chasm examined

  • Innovators: Demand extensive information, but will support the product even if the product is half baked
  • Visionaries (Early adopters): Derive value from the strategic leap forward, not the technology itself
    • Expect breakthrough, not improvement
    • Highly demanding, expect the "dream"
  • Pragmatists (Early majority): The large revenues reside with the pragmatists
    • Slow to make decisions, move only when they sense the market is moving
  • Late majority/Laggards

Chapter 3: Overview: How to cross the chasm

  • Sell to visionaries (early adopters) <-chasm-> Sell to pragmatists (early majority)
  • How to handle the chasm:
    • Take over a niche market
    • Company must be market driven not sales driven
    • Being sales driven in the chasm period is fatal
    • Problem: Leaders like sales driven companies, not market driven companies
    • Provide the "whole" product
  • Market leadership - big fish, small pond
    • Growth will need word of mouth - spreading the customers dilutes this: 10 customers in 10 segments is worse than 3-4 customers in each of 3 segments
    • Strategic niches
      • Commit to the niche
      • Act locally, not globally
      • Target closed communities
  • Platforms vs. Applications
    • Products must take a vertical approach to cross the chasm i.e. must become an application
    • Platforms enable mass market adoption - will help once the chasm is crossed

Chapter 4: Identify the point of attack

  • This is a high risk, low data decision
    • Informed intuition better than analytical reasoning
  • Define target customer characterizations
    • Use case scenarios
    • Market development strategy checklist
      • Target customer, reason to buy, whole product, partners, distribution, pricing, competition, positioning, next target customer
  • Size of the market:
    • Pick on someone your own size
  • Steps:
    • List library of target customer scenarios
    • Analyze, rank and decide
    • Commit to the point of attack

Chapter 5: Define the product

  • Whole product marketing
    • Whole product can be categorized based on satisfaction of requirements:
      • Generic, Expected, Augmented, Potential
    • A seemingly inferior product may actually be inferior only in the "generic", it may be superior in the "whole"
    • A whole product may need support:
      • Third parties usually do not contribute during the chasm
      • Need to form tactical alliances
  • Markets are an ecology of interrelated interests
  • Steps:
    • Develop a whole product diagram (donut)
    • Develop needed alliances/relationships

Chapter 6: Define the battle

  • Any force can defeat any other if the battle is defined
  • Create the competition:
    • Locate the product in a buying category which has established credibility with pragmatist buyers
    • Focus on the needs of the pragmatists: Use a Competitive Positioning Compass
      • Opinion/knowledge about technology (Specialist/Generalist)
      • Opinion about proposition (Supporter/Skeptic)
    • Crossing the chasm: Move sales from supporters of the technology/proposition to skeptics of the technology/proposition
    • Move from product metrics (fastest, easiest) to market metrics (largest base, cost)
  • Positioning
    • In people's heads, not in words
    • Pragmatists are conservative about changes in positioning
    • Positioning is about making a product easier to buy, not easier to sell
  • Process: Claim, Evidence, Communications, Feedback/adjustments
    • Pass the elevator test
    • For A, Who are dissatisfied with B, Our product is C, That provides D, Unlike E, We have assembled F
  • Proof: Market share, Alliance
  • Steps:
    • Focus product by defining competition
    • Define position

Chapter 7: Launch the attack

  • Objective: Secure a channel into the mainstream market with which pragmatists will be comfortable
    • Prioritize above revenue, profits, customer satisfaction
    • Motivate the channel
  • Customer oriented distribution, Distributor oriented pricing
  • Distribution: Direct selling, Retail, OEMs, VARs, System integrators
    • Can the channel create a relationship to the mainstream customer?
    • Direct selling is the best way to create the relationship when crossing the chasm
    • Retail fulfills a demand, rather than create it
    • VARs provide support
    • Price point between $10K and $75K is the hardest to sell
    • Products need marketing and end support
    • VARs do not expand a market
  • Start with direct selling, move to suitable channel after awareness is created
  • Pricing: Customer oriented, vendor oriented, distribution oriented
    • Customer oriented:
      • Visionaries: High cost: Value based pricing
      • Pragmatists: Competitive based pricing
      • Conservatives: Low cost: Cost based pricing
    • Vendor oriented:
      • Internal costs drive pricing decisions
  • Distribution:
    • Price based on  market
    • Price for market leadership
  • Steps:
    • Define the distribution channels
    • Define the pricing model

Chapter 8: Conclusion

  • Post chasm enterprise is bound by the commitments of the pre chasm enterprise
    • Avoid making wrong commitments in pre-chasm stage
    • Post chasm enterprise: Purpose is to make money
      • Stop custom development and roll out generic product
    • Pre chasm enterprise: Purpose is proof of concept of product and small early revenues
      • Typical mistake: Promise of hockey stick growth post chasm
      • Reality: Staircase: cycles of slow growth, stagnancy and then rapid growth, caused by repeatedly crossing chasms in different market segments
  • Venture capitalist concerns:
    • How long till chasm is crossed. How long before reasonable profit from mainstream market?
    • Chasm can be crossed only when the whole product is built, may need a long time
      • Technologist: Adopt the discipline of profitability from day one
        • Except: When a high entry barrier exists
        • Rapid development needed (land grab)
  • Composition of company needs to be different before and after the chasm in Engineering/Sales
  • Navigating the chasm: May need reorgs with new job descriptions to handle the shift

Monday, October 22, 2012

"DNA: A graphic guide to the molecule that shook the world" - Israel Rosenfield, Edward Ziff and Borin Van Loon

Discusses the DNA molecule including
  • A historical background of genetics prior to the discovery of the molecular structure of DNA
  • Chemical structure
  • Information storage
  • Information expression
  • Replication
  • Diversity
  • Related topics including cloning, sequencing, stem cells, epigenetics and  the origin of life.

The topics are covered in chronological order of discovery, with extensive background on the researchers involved in the discoveries. The following key discoveries are discussed in detail:
  • Watson/Crick double helix model of the DNA molecular structure
  • Crick's Adapter Hypothesis for information expression
  • Operon model of regulation (Jacob/Monod) for information expression


Modern genetics begins with the discovery of the molecular structure of DNA. Prior to this, researchers had reached the conclusion that cell division along with nuclei fusion was responsible for the transmission of genetic information. It was known through Mendel's research that phenotypes (observed properties) were the result of genotypes (genetic makeup) and that traits were randomly segregated during reproduction. A substance known as chromatin, which contained chromosomes, was known to originate from the nucleus. Chromatin was suspected to be involved in heredity, but it was believed that protein sequences (and not chromosomes/chromatin/DNA) were the primary mechanism for information storage/transfer.

Chemical structure

DNA is a molecule that consists of bases, phosphodiester bonds and a sugar backbone.
  • Sugar backbone: The backbone is a chain of sugar molecules. The sugar molecules are comprised of C, H, O atoms connected by single bonds in a 5 sided ring structure (Ribose). The molecules are identical for RNA and DNA except that the RNA sugar has OH in one location, while the DNA sugar has H in that location (hence Deoxy-Ribose). Each sugar molecule is connected to another sugar molecule via a phosphodiester bond. Each sugar molecule is also connected to one base.
  • Phosphodiester bond: A phosphorus atom surrounded by 4 oxygen atoms, connecting two sugar molecules
  • Base: The base is a molecule comprised of C, H, O and N atoms connected by single/double bonds. There are 4 types of bases:
    • Adenine, Guanine, Cytosine and Thymine for DNA
    • Adenine, Guanine, Cytosine and Uracil for RNA
Some terms used to describe this structure:
  • Nucleotide: A nucleotide is a single base, sugar molecule and phosphate group
  • Codon: A sequence of three nucleotides that codes for an amino acid
  • Gene: A sequence of codons that codes for a protein
This structure was discovered by analyzing images from X-ray diffraction. The images, along with advances in X-ray diffraction analysis at the time, indicated the structure to be a double helix, with the constraint that bonding across the helix is restricted to A-T and G-C pairs. This supported earlier studies which indicated that the proportions of A, C, G and T followed a set of rules (A=T, G=C, and hence A+G = C+T, called Chargaff's rules).

Information Storage

DNA encodes the information for generation of proteins. A sequence of 3 nucleotides, called a codon, maps to a single amino acid. A sequence of codons (a gene) encodes the information for a sequence of amino acids, i.e. a protein. A codon can take one of 4^3 = 64 possible values. There are only 20 standard amino acids, so multiple different triplets encode the same amino acid. In addition, some sequences have a special purpose: coding starts at a delimiter (ATG, which also codes for an amino acid) and stops at a triplet which does not specify any amino acid (one of the 3 stop codons, leaving 61 triplets that code for amino acids). There is also a large quantity of junk DNA between known gene sequences, whose purpose is unknown.
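The codon arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the names `BASES`, `STOP_CODONS`, `codons` and `coding` are made up for the example, and the three stop triplets (TAA, TAG, TGA) are taken from the standard genetic code rather than from the book notes.

```python
from itertools import product

BASES = "ACGT"

# Every possible triplet of bases: 4 * 4 * 4 combinations.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Stop triplets of the standard genetic code (written in DNA letters).
# Assumption: the standard code; some organisms use slight variations.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Triplets that actually code for an amino acid (ATG, the start codon, is one of them).
coding = [c for c in codons if c not in STOP_CODONS]

print(len(codons))   # 64
print(len(coding))   # 61
```

With 61 coding triplets and only 20 amino acids, most amino acids are necessarily encoded by more than one triplet.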

Proteins are produced inside the ribosomes, the cell's protein factory. The process of protein generation from DNA occurs through the mechanisms of transcription and translation:
  • Transcription: Information gets transferred from DNA to the Ribosome via mRNA. The enzyme RNA polymerase unwinds the DNA into 2 strands called the sense and template strands. The template strand directs transcription (What happens to sense strands?) The template strand links to a growing messenger RNA (mRNA) strand by forming A-U, G-C bonds. The start of the coding sequence is marked by a start codon (AUG) and the end is marked by a stop codon (UGA/UAA/UAG). After creation the mRNA moves to the Ribosome.
  • Translation: Information from mRNA is used to create proteins via tRNA (the adapter molecules from Crick's Adapter Hypothesis). tRNAs bind temporarily with the mRNA, and also with their amino acid. As tRNAs enter and leave the Ribosome, they leave behind a growing chain of amino acids, a protein. This continues until a stop codon is reached in the chain.
Note that some enzymes (such as polymerase) that help this process are themselves created by this process (a chicken and egg problem: which came first, the DNA template or polymerase?)
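The two steps above can be sketched in Python as a toy model. Everything here is an assumption made for illustration: the function names `transcribe` and `translate` are hypothetical, the template strand is written in the direction the polymerase reads it (so base pairing is position by position), and `CODON_TABLE` holds only a handful of entries from the standard genetic code, not all 64.

```python
# Base pairing between the DNA template strand and the growing mRNA strand.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

STOP_CODONS = {"UAA", "UAG", "UGA"}

# A small subset of the standard genetic code (assumption: enough for the example only).
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys"}


def transcribe(template):
    """Build the mRNA that pairs with a DNA template strand, base by base."""
    return "".join(PAIRING[base] for base in template)


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return the amino acid chain."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        # Raises KeyError for codons missing from the toy table above.
        chain.append(CODON_TABLE[codon])
    return chain


template = "TACAAACCGTTTATT"
mrna = transcribe(template)
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Lys']
```

The dictionary lookup in `translate` plays the role of the tRNA adapters: each one matches a codon on one side and carries a specific amino acid on the other.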

Information expression

This is an area that is under active research and seeks to answer the question of why the same strand of DNA in each cell can cause cells to perform different functions. The theory of the operon partly explains this. According to this theory, the DNA codes for regulatory molecules called gene repressors. A gene repressor may suppress expression of a gene by preventing creation of its mRNA. Information expression is vastly different depending on the type of organism:
  • Prokaryotes: No nucleus in cell, transcription/translation side by side in cell. Gene expression changes constantly
  • Eukaryotes: Nucleus in cell, transcription/translation are separated. Cells are highly specialized (e.g. liver cells vs. brain cells), i.e. very differentiated in expression. How differentiation happens is a topic of research.


Replication

DNA replication happens during cell division. The DNA divides into 2 strands by unwinding. Enzymes attach complementary bases to each strand, producing 2 new double stranded DNA molecules.


Diversity

Genetic diversity (different expressions within members of a single species) occurs mainly through reproduction. In this process, genetic information is transmitted through chromosomes. A chromosome is a single double stranded DNA molecule, tightly packed rather than extended as a free helix. Chromosomes form when chromatin condenses during cell division, at which point the DNA ceases to be transcribed/translated. There are 23 kinds, each in two copies; all pairs are identical except for the sex chromosomes (XY in males, XX in females). Each chromosome contains specific genetic sequences. (Q: The sequence in each chromosome is random? How does this account for creation of different DNA in progeny? How is the DNA formed for chromosomes?) Experiments have shown that organisms as primitive as bacteria can diversify genetic information by this process. Other forms of diversity can be caused by
  • Mutations
  • Viruses

Other topics discussed include:

  • Cloning: The process consists of the following steps
    • Take nucleus from grown organism
    • Implant it in an enucleated egg
    • The resulting organism should have identical DNA
  • Sequencing
    • Techniques pioneered by Sanger
    • Map gene sequences to characteristics/traits
    • Map gene sequences to diseases
    • SNPs (Single nucleotide polymorphisms): Single base differences, some of which are linked to disease
  • Stem cells
    • Mature cells can be converted to stem cells (Shinya Yamanaka)
  • Applications
    • Crime: DNA contains Variable Number of Tandem Repeat (VNTR) sequences: Non coding sequences, 9 to 80 bases long, repeated up to 30 times. These can be used to identify DNA with low probability of error.
    • Medical Research, Biotechnology
  • The selfish gene: Organisms exist to propagate DNA
  • Epigenetics, Origin of life

Sunday, October 7, 2012

Freakonomics - Steven Levitt/Stephen Dubner, 2005

Conclusions that appear to be intuitive/obvious are often logical fallacies of the type cum hoc ergo propter hoc (assuming that correlation implies causation). Disproving such arguments is fairly common in some problems in the science/engineering fields, where the conventional ways of doing it are theoretical methods such as regression analysis, Monte Carlo methods, or empirical measurements using:
  • An identification of all factors that impact the results of an experiment
  • Repeated experiments under controlled conditions, varying each parameter separately while keeping all others constant.
These techniques are either not applicable or are inadequate for problems in sociology/economics, especially the kind of questions that Levitt seeks to answer, which involve incomplete/inaccurate quantitative data. Techniques and arguments used include deep studies/interviews, arguments of logic (where data is unavailable) and cross domain collaborations with experts from other fields.

An outline of the work:
  •  Study of incentives and cheating (Chapter 1)
  •  Study of information and effect of information asymmetry (Chapter 2)
  •  Study of correlation vs. causation or how conventional wisdom is often wrong (Chapter 3/4/5/6)
Chapter 1: Incentives and Cheating
  • Incentive: Mechanism to induce one behavior (favored) over another (unfavored) by providing a reward
  • Cheating: Mechanism to defeat an incentive: acquire reward while performing (unfavored) behavior
  • Three types of incentives: Moral, Social, Economic
  • Any clever incentive scheme will result in the creation of (or attempts to create) an equally clever cheating scheme
  • Some conclusions from case studies:
    • Approx 90% of humans do not attempt to cheat particular systems despite ability to do so (bagel experiment)
    •  Cross correlation analysis can be used very successfully to detect cheating (Chicago public education system, Sumo wrestling)

Chapter 2: Information
  • Information asymmetry: When two parties to a transaction have vastly different degrees of expertise
  • Internet has reduced information asymmetry
  • Often exploited (real estate agents, car salesmen)
    • Can be extremely subtle: Terms in real estate ads, selling cars
    • Some revealed through correlation analysis
  • Information crime
    • Crimes committed by exploiting information asymmetry (Enron)
    • Difficult to discover, something drastic must happen

Chapter 3/4: Correlation vs. causation (Conventional wisdom)
  • Conventional wisdom (CW) can be incorrect in several cases.
  • Dramatic effects can be caused by subtle, overlooked, non obvious events
  • Events can be explained by careful study of the correct causes
  • Case studies demonstrating this include:
    • CW: Drug dealers make a lot of money
      • Reality: The structure of a drug dealing organization is similar to a corporation
      • Workers at the bottom have low wages, bad working conditions
      • Upper management keeps disproportionate share of profits
    • CW: Crime decreased in the 90s for several reasons
      • Correlations and logic used to check a number of likely causes
      • Closest correlation is with the Roe vs. Wade case and the legalization of abortion

Chapter 5/6: Case studies on parenting
  • Correlation between parenting approaches and the future success of children
    • A number of parenting factors are examined including those which show correlation (positive/negative) with child's success and those which have no correlation
    • All factors appear to be correlated in some way to status of parents (education level of parents/affluence etc.)
    • What a parent does is irrelevant compared to who the parent is (education level/affluence etc.)
  • Correlation between children's names and future success of children
    • Choice of name and success are uncorrelated, even though name choice has very strong correlation with race
    • There is a strong correlation between name choice and parents' characteristics (not the child's future)

Sunday, September 30, 2012

Physics/Quantum physics

A summary of the basics of quantum physics derived from well known texts (e.g. Feynman's lectures) as well as popular books (e.g. books by Hawking and Greene), and primarily Susskind's lectures on quantum physics (part of his Theoretical Minimum series) at Stanford. The lecture series is summarized here under the following topics:

  • System/state definition
  • Mathematical foundation
  • Basic principles of quantum mechanics
  • System evolution with time/change
  • Uncertainty
  • Entanglement

Topic 1. System/state definition

Susskind's approach focuses on the definition of a system and its state. He describes a system whose state behaves according to certain rules (described below). The system's behavior under these rules is studied using mathematics, and the behavior that emerges is non intuitive, with effects such as uncertainty and entanglement emerging from the math.

The basic rules have experimentally been shown to be the rules under which physical systems really behave at a quantum scale, so the emergent behavior described by the math must be true, no matter how non intuitive it is. Susskind doesn't discuss the reasons why the rules are true. It's a somewhat philosophical discussion, though he does say that you either
  • accept the rules as ground truth, and therefore accept the non intuitive consequences as the way our universe works, or,
  • believe that there exist hidden variables, behaving deterministically, responsible for the visible non deterministic behavior and you're free to continue to search for them, though almost a hundred years of experimental science has failed to find them
Here is the description of the system/state and its rules. Consider a system whose behavior is governed by two rules:
  • Measurement of state is probabilistic: Consider a system with two states (+1, -1) whose state can be measured by an apparatus. Any attempt to measure the state of the system using the apparatus gives a result of either +1 or -1. If the apparatus is used to try to measure the system by forcing the apparatus to measure possible intermediate states (between +1 and -1), the result of the measurement is still either +1 or -1, with the average of a large number of measurement samples converging to the value of an imaginary intermediate state. E.g. let's say that +1 and -1 represent two orientations of the system at 180 degrees from each other. If the apparatus is aligned to the axis, the result is either +1 or -1 and will continue to be so, ad infinitum, if nothing else changes. However, if the apparatus is not aligned to the axis, the result will still be either +1 or -1 and will change from measurement to measurement. However, the average of a large number of measurements taken using the apparatus not aligned to the axis will converge to the cosine of the angle between the apparatus and the axis.
  • Measurement causes state change: Consider a system whose state (+1, -1) can be measured using an apparatus. Measure the state of the system by aligning the apparatus and the system. The state is measured to be either +1 or -1 and will continue to be so ad infinitum, if nothing else is done to the state except for the same measurements. However, if the apparatus is rotated through an angle less than 180 degrees and a measurement is taken, then it may be either +1 or -1. If the apparatus is once again aligned to the system axis and a measurement is taken, it will no longer necessarily be equal to the first set of measurements.
It's important to note that the first rule is not a limitation of the resolution of the apparatus, but a property of the system. The second rule implies that logical propositions (AND/OR) do not carry the same implications for this system as they do for classical physics. Temporal ordering of the operands will influence the result of the operation. The uncertainty principle will follow from this rule.
This description of the system, along with a mathematical foundation (Topic 2), will be used to derive the principles of quantum physics (Topic 3).
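
The two rules can be mimicked with a small Monte Carlo sketch in Python. This is a toy model under stated assumptions, not something from the lectures: the class name `Spin`, the function `average_result` and the restriction to directions in a single plane are all made up for illustration. The probability of a +1 result is taken as (1 + cos θ)/2, which is simply the value that makes the average of ±1 outcomes equal to cos θ, as rule 1 requires.

```python
import math
import random


class Spin:
    """Toy two-state system; `angle` is the direction along which it would read +1."""

    def __init__(self, angle=0.0):
        self.angle = angle

    def measure(self, apparatus_angle, rng):
        theta = apparatus_angle - self.angle
        # Rule 1: the outcome is always +1 or -1, with an average of cos(theta).
        p_plus = (1 + math.cos(theta)) / 2
        result = 1 if rng.random() < p_plus else -1
        # Rule 2: measuring changes the state; it is now aligned (+1) or
        # anti-aligned (-1) with the apparatus.
        self.angle = apparatus_angle if result == 1 else apparatus_angle + math.pi
        return result


def average_result(apparatus_angle, samples, rng):
    """Average over many freshly prepared systems, each measured once."""
    total = sum(Spin(0.0).measure(apparatus_angle, rng) for _ in range(samples))
    return total / samples


rng = random.Random(0)
print(average_result(math.radians(60), 100_000, rng))  # close to cos(60 deg) = 0.5

s = Spin(0.0)
print([s.measure(0.0, rng) for _ in range(5)])   # aligned: always +1
s.measure(math.radians(90), rng)                  # rotated measurement disturbs the state
print([s.measure(0.0, rng) for _ in range(5)])   # first value is now random, then it repeats
```

The last three lines show why ordering matters: the same aligned measurement gives a guaranteed result before the rotated measurement and an unpredictable one after it.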

The Startup Game: Inside the partnership between Venture Capitalists and Entrepreneurs - William Draper, 2012

Part biography, part history lesson and part tutorial on the Silicon Valley venture capital industry. The book covers three topics:
  • The origins and evolution of the early venture capital industry in Silicon Valley (Chap 1)
  • The venture capital process, both in the early days and today (Chapters 2,3,7, 9)
  • Draper's career in areas other than venture capital (Chapters 4,5,6 and 8)
Describes the functioning of the VC industry along with a number of general guiding principles, some well known, others unique. He illustrates them by interspersing them with anecdotes from his experiences with other VCs, entrepreneurs, economists, politicians and investors.

Chapter 1 is a history of the origins of VC in Silicon Valley.
The first venture capital firm in Silicon Valley was Draper, Gaither and Anderson (DGA), started in Palo Alto with $6 million in investment in the early 60s by William Draper's father. None of its initial investments could be described as technology - they ranged from defibrillators to aborted attempts to invest in real estate in Hawaii to dental floss dispensers(?). Its first technology ventures included Diablo, Century Data (disk drive technology) and Kasper Instruments (semiconductor mask alignment). Draper credits an academic, Fred Terman, as responsible for the creation of the valley ecosystem, with an interesting observation on how Stanford made Terman its president around the same time that Yale's president, a history professor, canceled its engineering program because Yale did not consider itself a "trade school".

Chapter 2 discusses the five areas (Funding, Teams, Pitch/Product/Market, Deals, Relationships) which shape how venture financing works.
  • Funders: Today's model for a VC firm is a small firm comprised of limited partners (who put in most of the funds) and general partners (who do most of the management). Draper describes how this model started with DGA, which began with approx. $6 million from three limited partners (including the Rockefeller family) and three general partners (D, G and A). The profit sharing agreement was 60% of profits to the limited partners and 40% (the carry) to the general partners, who split the 40% equally. Limited partners were charged an annual fee of 2.5% by the general partners for managing the fund. Today this model, with a few tweaks to the numbers, is still the prevalent one, with a few additions: For tax reasons, general partners must invest 1% of the money in a fund. Investments are made in a series of funds, with a termination date for each fund, in contrast to the Sutter Hill Ventures (Draper's second VC firm) fund, which was (and still is) an evergreen fund where limited partners can opt out at specified dates. Today, this model is used to manage VC funds totaling approx. $200 billion (circa 2011). He also talks briefly about other sources of funding: angels, corporations and venture debt.
  • Team/team evaluation: Draper's methods for evaluating founders/teams are somewhat loosely defined, intentionally, and he talks about why it varies from VC to VC. In general, he looks for references, leaders who know the field intimately, have run another company, and have ideas for new features/technologies/products/markets.
  • Pitch/product/market: The pitch is as much about selling the team/leader as it is about selling the product/market, but must cover at least the following:
    • How the product differentiates itself from competition
    • Market size
    • Capital requirements
  • Deal: The deal is what brings the funders and the team together. The goal is to value the company and decide on how much funding to provide and at what cost.
    • In the early days (70s-80s) the method he used was:
      • For a company with no profits, but good steady revenue and good prospects, the price was 1 year's sales.
      • For a company making a profit, the rule was 20 times profit (which is equal to 1 year's sales if the profit is 5% after taxes).
    • The benchmark for the first (commonly called A) round is to sustain the company for 1-2 years only. It should result in a 50-50 ownership split between VCs and entrepreneurs. Any dilution for employees is shared equally among VCs/entrepreneurs.
  • Relationship: The 3 stages of the early company are
    • Trial and Error: Mistakes are made, but can be corrected. This is when the technical team executes.
    • Reality: Reality sets in about market size and customer acquisition.
    • Rev it up/Close it down: Company survives or dies.
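
The fund and deal arithmetic from the Funders and Deal bullets above can be sketched in Python. This is a rough illustration only: the function names and the example figures ($10 million in sales, $1 million of fund profit) are made up, and only the percentages and multiples come from the notes.

```python
def value_from_sales(annual_sales):
    """Early rule of thumb for a company with revenue but no profit: 1 year's sales."""
    return annual_sales


def value_from_profit(annual_profit, multiple=20):
    """Early rule of thumb for a profitable company: 20 times profit."""
    return multiple * annual_profit


def split_profits(fund_profit, lp_share=0.60, n_general_partners=3):
    """60/40 split: limited partners' total, the carry, and each general partner's cut."""
    lp_total = fund_profit * lp_share
    carry = fund_profit - lp_total
    return lp_total, carry, carry / n_general_partners


# Hypothetical company: $10M in annual sales with a 5% after-tax profit.
sales = 10_000_000
profit = 0.05 * sales

# The two rules agree at a 5% margin, because 20 * 0.05 = 1.
print(value_from_sales(sales), value_from_profit(profit))

# Hypothetical fund profit of $1M, shared between limited and general partners.
print(split_profits(1_000_000))
```

At margins above 5% the profit rule gives the higher price, and below 5% the sales rule does.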
List of the top ten errors an entrepreneur makes, a useful list classified into categories:
  • Business development:
    • Overestimation of market size/customer acquisition rate
    • Unclear marketing plan
  • Communication:
    • Unclear elevator pitch
    • Approaching VCs incorrectly
  • Execution:
    • Underestimation of schedule/timelines
    • Over utilization of the entrepreneur
    • Inflexibility
    • Failing to cut costs when needed
    • No action during a recession
  • Team:
    • Board lacks diversity
Chapter 3 discusses Draper's views on the characteristics he looks for in an entrepreneur: intelligence, education, energy, passion, expertise and integrity, with an anecdote illustrating the cost of vision with bad execution.

Chapter 7 discusses the exit of a venture. The goal of a venture is value creation followed by the exit. Here Draper discusses the IPO as an exit. His opinion is that an IPO should happen iff the company can deliver increasing profits and revenue for the next several years, so the public can make a gain on its investment. Any other consideration should be secondary to this criterion. He follows up with a short description of how the business of IPOs has changed in the last 10-20 years, starting with the four horsemen of investment banking and their takeover by commercial banks as the Glass-Steagall Act was repealed in 1999. He describes the general concepts of an IPO, which at a high level is that companies sell a chunk of stock to underwriters (investment banks). Usually underwriters buy all of the stock at a price set the day before the IPO, with an additional fee, typically 7%.