Wednesday, December 27, 2017

The Master Algorithm - Pedro Domingos

A description of the state of present-day machine learning, organized by the main techniques/algorithms, the domains from which they originated, the history of the field, and how the search for the master algorithm continues.

1. The Master algorithm

  • The AI algorithm that can replicate/reproduce all of human knowledge.
  • Areas that might yield the master algorithm: Neuroscience, Statistics, Evolution, Physics, Computer Science
  • 5 approaches to AI (and the fields/techniques that influenced them): Bayesian (statistics), Symbolist (inverse deduction), Analogist (analogies/SVM), Connectionist (neuroscience), Evolutionary (genetic programming/evolution)

2. Hume's problem of induction

  • Can everything be inferred from limited knowledge? Can the past ever be used to accurately predict the future?
  • Is a Master Algorithm feasible?
  • Rationalism (All knowledge comes from reasoning)  vs empiricism (All knowledge comes from experimentation/observation)
  •  The Symbolist approach:
    •     Knowledge is just the manipulation of symbols
    •     Inverse deduction: Deduce complex rules from simple rules
    •     E.g. Decision trees

3. The Connectionist approach

  • Neural networks:
    • Theory of how the brain learns (Hebb's rule): signals fire across axons between neurons; a connection is reinforced every time it fires, forming a memory
  • Perceptron: Weighted inputs + Thresholding function
    • Drawback: Can classify only when there is a linear boundary
    • E.g. XOR: the four input cases 00, 01, 10, 11 cannot be separated by a single line
    • E.g. Gender + Age (0/1): cannot classify if the condition is true for Male/Young and Female/Old but not for the other cases
  •  Neural networks: Multilayer perceptrons with backprop to learn
  •  Others: Autoencoders, Boltzmann machines, CNNs
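
The perceptron described above can be sketched in a few lines. This is a minimal illustration (not from the book): it learns the linearly separable AND function with the classic perceptron learning rule; no single perceptron could do the same for XOR, since XOR has no linear boundary.

```python
# Minimal perceptron: weighted inputs + thresholding function.
# Learns AND, which is linearly separable (XOR would never converge).

def train_perceptron(data, epochs=20, lr=0.1):
    """data: list of ((x1, x2), target) pairs with 0/1 targets."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), t in data:
            y = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0  # threshold
            err = t - y
            w[0] += lr * err * x1  # perceptron learning rule
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
print([predict(w, b, x1, x2) for (x1, x2), _ in AND])  # [0, 0, 0, 1]
```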

4. The Evolutionary/Genetic approach

  • Based on genetic algorithms
  • A population of candidate solutions is repeatedly recombined over iterations; at each iteration the weakest solutions are discarded
  • Iterations continue until an optimal solution is reached.
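
A toy version of the loop described above (my sketch, not the book's): candidate solutions are bit strings, fitness is the number of 1s (the standard "OneMax" toy problem), and each generation keeps the fittest half, recombines survivors by crossover, and mutates the children.

```python
import random

random.seed(0)  # deterministic run for illustration

def fitness(bits):
    return sum(bits)  # OneMax: count the 1s

def crossover(a, b):
    cut = random.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.05):
    return [1 - g if random.random() < rate else g for g in bits]

def evolve(pop_size=30, length=16, generations=60):
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]  # discard the weakest half
        children = [mutate(crossover(random.choice(survivors),
                                     random.choice(survivors)))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best), "out of 16")
```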

5. The Bayesian approach

  • Based on Bayes' rule: the probability of the cause given the effect, computed from the probability of the cause, the probability of the effect, and the probability of the effect given the cause.
  • P(cause|effect) = P(cause)*P(effect|cause)/P(effect)
    •    P(cause): Prior
    •    P(cause| effect): Posterior
    •    P(effect | cause)/P(effect): Support
    •    Imagine a Venn diagram with 4 areas: C only, E only, C and E, neither => four probabilities: P(C), P(E), P(E|C), P(C|E)
  • Bayesian networks: Networks of Bayesian inferences: Used to infer probabilities of complex sequences
  • E.g. Hidden Markov Models, Kalman Filters (Continuous variable version of discrete HMMs)
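
The formula above becomes concrete with numbers. A worked example with hypothetical values: a rare cause (prior 1%) and a noisy effect that occurs 90% of the time given the cause and 5% of the time without it.

```python
# Bayes' rule: P(cause|effect) = P(cause) * P(effect|cause) / P(effect)
p_cause = 0.01                # prior P(cause)
p_effect_given_cause = 0.90   # P(effect | cause)
p_effect_given_not = 0.05     # P(effect | no cause)

# Denominator by total probability.
p_effect = (p_effect_given_cause * p_cause
            + p_effect_given_not * (1 - p_cause))

posterior = p_cause * p_effect_given_cause / p_effect
print(round(posterior, 3))  # 0.154: the effect raises a 1% prior to ~15%
```

Note how the "support" term P(effect|cause)/P(effect) scales the prior: 0.90/0.0585 is roughly 15.4.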

6. The Analogist approach

  • Find similarities between examples
  • E.g. Nearest Neighbor, SVM

7. Unsupervised approaches: Learning without teaching

  • E.g. KMeans, Principal Component Analysis, Reinforcement learning, Relational learning

8. Unifying the algorithms

  • Metalearning: Techniques to combine multiple models. All reduce variance
    • E.g. Random forests (bootstrapping + bagging), boosting
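
A minimal sketch of the bagging idea (my illustration, far simpler than a real random forest): many one-threshold "stumps" are each trained on a bootstrap resample of the data and combined by majority vote, which averages away the variance of any single stump.

```python
import random

random.seed(1)  # deterministic run for illustration

# 1-D data: x < 0.5 -> class 0, x >= 0.5 -> class 1.
data = [(x / 10, 0) for x in range(5)] + [(x / 10, 1) for x in range(5, 10)]

def train_stump(sample):
    """Pick the threshold (from the sample's xs) minimising errors."""
    best_t, best_err = 0.0, len(sample) + 1
    for t in [x for x, _ in sample]:
        err = sum((x >= t) != bool(y) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bag(data, n_models=25):
    stumps = []
    for _ in range(n_models):
        sample = [random.choice(data) for _ in data]  # bootstrap resample
        stumps.append(train_stump(sample))
    return stumps

def predict(stumps, x):
    votes = sum(x >= t for t in stumps)
    return int(votes * 2 > len(stumps))  # majority vote

stumps = bag(data)
print([predict(stumps, x) for x in (0.2, 0.8)])
```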

Thursday, October 26, 2017

Little Bets - Peter Sims

Little bets:

  • Small decisions/goals/tasks
  • Large companies fail because they look for large billion dollar markets rather than experimenting with smaller markets that might grow: E.g: HP
  • Affordable loss principle: Make decisions based on what you can afford to lose, rather than on expected gain

Growth mind set:

  • Large number of small attempts with multiple failures, rather than one large bet
  • Fixed mind set: Abilities/intelligence is fixed. Growth mind set: Results are determined by effort, not intelligence. 

Fail fast to learn fast:

  • Healthy perfectionism: Driven by desire for excellence. Unhealthy perfectionism: Driven by fear of failure
  • Fail fast through little bets around prototypes

Genius of play:

  • Environments that lead to improvisation result in creativity
  • Plussing (Pixar): Idea evolution in a team is through a series of "ands" rather than "buts"

Problems are the new solutions:

  • Break large problem into smaller problems. E.g. Walt Disney Concert Hall, Agile development, McMaster's Iraq strategy

Questions are the new answers

  • Need to go out into the world and ask questions to find the problems. E.g. Grameen Bank, McMaster's Iraq strategies
  • Encourage voracious questioning

Learning a little from a lot

  • Learn from everyone, to get different perspectives
  • Build an open network of diverse people, and maintain it to constantly receive different perspectives

Learning a lot from a little

  • Seek out active users (early adopters). They provide 75% of improvements you will need

Small wins

  • Little bets can lead to small wins. Small wins can lead to successes.

Sunday, January 22, 2017

Confucius in 90 minutes - Paul Strathern

Confucius (Kungfutzu)

  • Circa 600BC in North central coastal China. 
  • Started a successful school for bureaucrats
    • Taught his philosophies of conduct and ethics. 
    • Students were often sons of rulers. 
  • Later in his life traveled through China,  meeting and advising rulers of various states.


Confucianism

  • A philosophy that evolved out of his teachings. 
  • Teachings were practical rather than religious or metaphysical. 
  • Dealt with the conduct and morality of rulers, bureaucrats, and citizens. 
  • Central premise: Ordinary activities of individuals are sacred and must be conducted in an ethical manner. 
  • Central concept: "jen" - a quality of magnanimity, virtue and honesty which every individual should strive for. 
  • The goal was to produce a society of individuals who live a life of harmony and virtue.
  • Contrast with the other major philosophy of the time, Taoism ("the way"), which dealt with metaphysics.
  • Encapsulated in pithy sayings documented in his books of sayings - the Analects. 
  • Confucianism has other sacred texts (some predated Confucius, others were edited by his followers):
    • Four books (one of which is the Analects)
    • Five Classics (among which are the I Ching: Book of Changes, dealing with metaphysics and the cosmos as an interaction of yin and yang; the Book of Poetry; the Book of History). 

    Sunday, November 24, 2013

    The Physics of Wall Street - James Owen Weatherall

    The evolution of quantitative trading through a history of the major principles involved in building the financial models, the researchers who proposed them, their impact and the reasons behind their failures. Each chapter deals with a single principle behind a model. Successive chapters follow the evolution of these principles, starting with the random walk model, progressing through delta/dynamic hedging, chaos theory, black box modelling and ending with extreme event detection through log periodic variations.

    Louis Bachelier, "A theory of speculation"

    • Bachelier's dissertation ("A theory of speculation") proposed that random walks could be used to model stock prices.
    • Valid if the trade is a fair bet. Intuitively, a trade means the buyer believes the information is positive and the seller believes it is negative, so the trade price is the price at which the probability of going up equals the probability of going down. Equivalent logic to the Efficient Market Hypothesis (Fama, Chicago School)
    • If stock prices follow a random walk, the displacement from the starting point is normally distributed, with variance increasing with time. The distribution of the future price at a given time is normal, with mean at the starting price and variance depending on the elapsed time. As time increases, variance increases and the normal curve becomes flatter
    • Bachelier extended the model to options/derivatives. Fair price of an option is price that would make it a fair bet. Used random walk to calculate probability of a future price, and derived a fair price estimate.
    • Basis for model is somewhat flawed, e.g if efficient market hypothesis was true, bubbles could not happen. Also, model was not fully validated with real data.
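
Bachelier's central claim above is easy to check by simulation. A minimal sketch: many ±1 random walks, whose final displacements have mean near 0 and variance near the number of steps (variance grows linearly with time, flattening the curve).

```python
import random

random.seed(0)  # deterministic run for illustration

def final_displacements(n_walks=20000, n_steps=100):
    finals = []
    for _ in range(n_walks):
        pos = 0
        for _ in range(n_steps):
            pos += random.choice((-1, 1))  # one step of the random walk
        finals.append(pos)
    return finals

finals = final_displacements()
mean = sum(finals) / len(finals)
var = sum((f - mean) ** 2 for f in finals) / len(finals)
print(round(mean, 2), round(var, 1))  # mean near 0, variance near 100
```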

    Maury Osborne, "Brownian motion in the stock market".

    • If Bachelier's hypothesis were right, stock prices would be normally distributed, which was not supported by real data. Osborne showed that returns were normally distributed, so stock prices follow a log-normal distribution. Rates of return follow a random walk (prices change by a fixed percentage, not by a fixed amount), i.e. prices are log-normally distributed
    • Has an intuitive basis: Investors do not care about absolute price, they care about rate of return. Also, from the Weber-Fechner psychological principle: logarithms model human response to stimuli
    • Hypothesis that markets are random seems to indicate that in the long term, investments will yield no gain, However, estimating future values of options can be used to develop instruments that yield a profit.
    • Later, Osborne rejected the memoryless efficient market hypothesis in favour of memory-based models: after prices go up they are likely to go down, and vice versa. A fundamental change from the random walk assumption
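
Osborne's correction above can be sketched the same way: make each step a fixed percentage move instead of a fixed amount, and it is log(price), not price, that does the additive random walk (so prices come out log-normal).

```python
import math
import random

random.seed(0)  # deterministic run for illustration

def simulate_prices(n_paths=20000, n_steps=100, step=0.01):
    finals = []
    for _ in range(n_paths):
        price = 100.0
        for _ in range(n_steps):
            price *= 1 + random.choice((-step, step))  # fixed % move
        finals.append(price)
    return finals

logs = [math.log(p) for p in simulate_prices()]
mean = sum(logs) / len(logs)
var = sum((x - mean) ** 2 for x in logs) / len(logs)
# log(price) behaves like the additive walk: mean stays near log(100),
# variance grows linearly in the number of steps (about 100 * 0.01^2).
print(round(mean, 3), round(var, 4))
```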

    Benoit Mandelbrot,  "The fractal geometry of nature"

    • Mandelbrot's work showed that real market returns are governed by Levy-stable distributions with 1 < alpha < 2, i.e. long-tailed distributions.
      Extreme events occur much more often than predicted, making random-walk-based models obsolete.
    • Mandelbrot's theories emerged around the same time that random walks were gaining ground in financial modeling, but failed to gain traction because of complexity/tractability. The random walk gives good results "most" of the time; long-tailed models are not tractable.
    • Notes on long-tailed distributions:
    • Levy-stable distribution: Alpha characterizes the tail. Normal: alpha = 2, Cauchy: alpha = 1 (alpha < 1 => the distribution has no average). Self-similar features have no average
    • Zipf's law: Frequency of occurrence of events is related to their ranking
    • Pareto principle: the 80:20 rule
    • Cauchy distribution: a long-tailed distribution

    Edward Thorp, "Delta hedging"

    • Bachelier, Osborne, Mandelbrot did not apply their theories to real investments. Ed Thorp was the first to apply their theories to the market
    • Card-counting strategy to gain an edge at blackjack (21). If you have a strategy (edge) that is probabilistically profitable in the long run, how much should you bet to avoid "Gambler's ruin"? Thorp used the Kelly criterion (from information theory, originally about the likelihood of correctness when a message is distorted by noise) to calculate the optimal amount to gamble on a favoured bet. The Kelly criterion specifies the fraction to bet given the advantage and payout, and shows that the rate of return equals the information rate
    •  Applied the strategy to options (warrants): 
      • Used Osborne/Bachelier equations to estimate how much a warrant should be worth. Thorp found most options were overpriced according to pricing theory. This provided an edge in the warrant market (not the stock market). 
      • Used short selling of options to exploit the edge. Short selling allows investors to bet against a stock without owning it.
      • Thorp hedged the short sale of warrants against the underlying stock - the first hedge fund. The underlying stock protects against an increase in option value. Protects against all but large changes in stock value. Controls risk, but does not eliminate it.
      • Procedure: The fair price of an option is the price at which it is a fair bet. Assume stock prices are log-normal, calculate the option price, then calculate the proportion of stocks and options to execute delta hedging
    • Even though hedging guarantees profit, long- and short-term profits are taxed differently, reducing profits 
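
The Kelly criterion mentioned above has a closed form for a simple bet: with win probability p and net odds b, the optimal fraction of the bankroll is f* = (b*p - q)/b, with q = 1 - p. The numbers below are hypothetical.

```python
def kelly_fraction(p, b):
    """Optimal bet fraction for win probability p and net odds b."""
    q = 1 - p
    return (b * p - q) / b

# A 55% chance of winning at even odds (b = 1):
f = kelly_fraction(0.55, 1.0)
print(round(f, 2))  # 0.1: bet 10% of the bankroll
```

A negative f* means there is no edge and the bet should be skipped.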

    Fischer Black, "Black-Scholes-Merton options pricing model",  "Dynamic hedging"

    • CAPM: Capital Asset Pricing Model: Proposed a model that assigned a price to risk. Linked risk and return via a cost-benefit analysis of risk premiums
    • Dynamic hedging: It is always possible to construct a portfolio consisting of an asset and its option that is always risk free 
    • Procedure:  
      • Assume there exists a mix of stock/options to construct a risk free portfolio
      • Use CAPM to calculate risk free rate
      • Calculate price of options in order  to realize the risk free return
    • Allowed banks to construct options to sell them. Banks could sell options, and reduce risk, by buying corresponding asset
    • 1987 crash: portfolio-insurance-based hedge funds. O'Connor used a modified Black-Scholes model that accounted for long-tail events and was not impacted by the 1987 crash
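
The Black-Scholes call price itself is a short formula, assuming log-normal prices and continuous hedging. This sketch uses standard textbook notation (spot S, strike K, rate r, volatility sigma, expiry T); the inputs below are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def call_price(S, K, r, sigma, T):
    d1 = (math.log(S / K) + (r + sigma ** 2 / 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    # Call = discounted expected payoff under the risk-neutral measure.
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

# At-the-money call: spot 100, strike 100, 5% rate, 20% vol, one year.
print(round(call_price(100, 100, 0.05, 0.2, 1.0), 2))  # 10.45
```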

    The Prediction Company, Lorenz, "Chaos theory"

    • Lorenz developed Chaos theory -  Sensitive dependence of state on initial conditions
    • The Prediction Company was started by a group of physicists with expertise in chaos theory and prediction algorithms. Their objective was to find the signal in the noise, applying their understanding of chaos and genetic algorithms. Developed statistical arbitrage on correlated assets, e.g. pairs trading, and algorithms around voting for trades
    • Most significant contribution was black-box modelling - building black boxes that predicted based on accuracy on past real data (training sets, etc.)
    • One premise was that markets are inherently unpredictable: they obey the efficient market hypothesis, which implies they should be impossible to predict. However, if anomalies are detected (e.g. a stock price is away from its normal, expected value), they can be exploited before the market returns to equilibrium. This needs computational power and speed to detect and act

    Didier Sornette: "Self organisation"

    • Ruptures in physical systems result from the self-organization of components. Self-organisation: uncorrelated entities begin to join together in correlated behaviour.
      Log-periodic patterns predict ruptures. Used to predict the breaking of water tanks, and earthquakes. 
    • Specific crashes (Dragon Kings) may be caused by the state of the market rather than a particular event. More extreme than long-tailed events, they may be predictable through log-periodic observations. Self-organisation is difficult to predict and has fractal properties, but log-periodic behaviour in properties may indicate the system is in a dragon-king state
    • Predicted the 1997 Asian currency crash, 2000 dotcom crash

    New Manhattan Project

    • Gauge theory and its application in calculating a new CPI


    • Renaissance Technologies: 
      • Medallion Hedge Fund, approx 2500% return  (compared to 1700% Soros)
      • 40% over lifetime, compared to 20% (Berkshire Hathaway)
      • 80% return in 2008 during the crisis
      • One asset is usually a derivative
    •  Derivative:  Contract based on some kind of security: stock, bond, commodity
      • Objective: Reduce risk (historically with commodity futures), now with stock futures
    • Hedge fund: Counterbalanced portfolio comprising an asset and its derivative. Calculate the relationship between derivative prices and the underlying asset price, quantify the risk of a fund based on derivatives, keep the portfolio in balance.
    • 1971: Chicago Securities Board allowed the first options market
    • Bretton Woods agreement (1944): Fixed exchange rates, all currencies tied to the dollar, the dollar tied to gold. Abandoned by Nixon on the recommendation of Milton Friedman (Chicago school). Currency futures became widely traded after this.
    • 1987 crash: Portfolio insurance: Hedge: Buying a stock, short selling futures. Volatility smile: an abnormality in options pricing graphs caused by shortcomings of the Black-Scholes model
    • 2007 crash: Banks needed an asset that was like a treasury bond (low risk), that they could provide as collateral on deposits from corporations/other banks (the shadow banking system). They used consumer debt (mortgages, credit cards, student loans) - Collateralized Debt Obligations (CDOs). The shadow banking system collapsed when the underlying assets became toxic. Mathematical models made a flawed assumption of the independence of failure of individual assets (mortgages). The failure was followed by a run on the banks.

    Tuesday, March 12, 2013

    What to listen for in music - Aaron Copland,

    Summarizes the basics needed to understand and appreciate music at a reasonably deep level. It focuses mostly on western classical music, with some mention of jazz. It covers the process of listening and composing, the 4 major elements (rhythm, melody, harmony, tone color) and musical structure (4 major forms in western classical music).


    Most people have the prerequisites to developing an appreciation of music, though they may not be aware of it.
    • Short sequence recognition:  Ability to recognise a melody i.e. a short progression of notes 
    • Long  recognition: Ability to relate what happens in a section of music to what happened before and what happens after

      How we listen to music

      There are different ways in which one may attempt to listen to music:
      • Sensuous plane: Listening without thinking; a diversion.
      • Expressive plane: The feeling that the composer is striving to express, or the feeling that the listener feels. The meaning of the music. A controversial topic because of the difficulty in identifying what a musical work expresses.
      • Musical plane: The manipulation of the notes: sequences, combinations, speeds, patterns. This book deals with this plane

      The creative process 

      Music works are composed using different methods. Types of composers include:
      • Spontaneously inspired: Composers begin with a composition that is close to completion. E.g. Schubert
      • Constructive: Continuous refinement of themes. E.g. Beethoven, as deduced from his notes.
      • Traditionalist: Starts with a pattern, rather than a theme. The pattern may be, e.g. the music style of the age/place. E.g. Bach
      • Pioneer: Opposite of traditionalist. Is experimental, adds new harmonies, new principles

      Elements of music

      4 essential elements:
      • Rhythm
      • Melody
      • Harmony
      • Tone color


      Measured music system: 

      • Rhythmic units are divided into measures separated by bar lines
      • A measure generally has 4 beats ("instants").
      • The number of beats between the bar lines defines the system: E.g. 2/4, 3/4, 5/4, 6/4
      • Stress/Accent: Some notes are stressed/accented (the down beat)
      • Meter vs. Rhythm: The stressing of notes defines the meter


      • Measured music system started around 1100 AD. Prior to that most music had rhythm that was based on words (Gregorian chants).
      • End of nineteenth century was when newer features started:
        • Combination meters (2/4 + 3/4) were used e.g. Tchaikovsky
        • Grouping of notes within a bars (2-3-2/8)
        • Outside the bar:
          • Polyrhythms: Two simultaneous different rhythms, e.g. 2/4 coincides with 3/4
          • Sometimes with non-coinciding first beats (length of musical unit is different?). E.g. one rhythm is 2/4, which overlaps with 3/4
          • Frequently used in Chinese, Hindustani, and African music, and in madrigals (rhythms from words)


      Melody

      • Progression of notes in time, has a skeletal frame
      • Exists within a scale system
        • Scale: Set of notes between a tone and its octave
        • Octave: 12 equal semitones,
        • CC#DD#EFF#GG#AA#BC
      • Chromatic scale 
        • 12 semitones, i.e. all notes
        • CC#DD#EFF#GG#AA#BC
      • Diatonic scale
        • 7 of the 12 semitones: 2 whole tones, half tone, 3 whole tones, half tone
        • 12 possibilities, starting with each semitone
        • Starting tone is called the key or tonic
        • Key may be major or minor mode (?): 12 scales in major mode, 12 in minor mode
        • CDEFGABC
      • Four scale systems:
        • Oriental, Greek, Ecclesiastical, Modern
      • Scales center around the tonic; in order of dominance: the 5th, then the 4th; the 7th degree is the leading tone (leads to the tonic)
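
The scale construction above is mechanical enough to code. A small sketch: pick 7 of the 12 semitones by walking the major-mode interval pattern (whole, whole, half, whole, whole, whole, half) from any of the 12 tonics.

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # semitone steps: W W H W W W H

def major_scale(tonic):
    i = NOTES.index(tonic)
    scale = [tonic]
    for step in MAJOR_STEPS:
        i = (i + step) % 12  # wrap around the octave
        scale.append(NOTES[i])
    return scale

print(major_scale("C"))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B', 'C']
print(major_scale("G"))  # ['G', 'A', 'B', 'C', 'D', 'E', 'F#', 'G']
```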


      Harmony

      Started in the ninth century
      • Organum: Same melody repeated at a 4th or 5th interval above or below
        • Interval: Distance between two notes
      • Descant: Two independent melodies moving in opposite directions
      • Faux bourdon: Intervals of 3rd and 6th
      • All chords are built from the tonic, upwards in a series of intervals of a 3rd
      • Triad chord: 1-3-5, 7th chord:  1-3-5-7, 9th (1-3-5-7-9), 11th (1-3-5-7-9-11), 13th (1-3-5-7-9-11-13)
      • Return to the tonic is a principle in all early harmonic work
      • More recent developments:
        • Atonality: Feeling of central tone lost (Wagner), Abandoning tonality (Schoenberg, Debussy). Opens questions of consonance, dissonance
        • Polytonality: Use of multiple keys simultaneously (right hand plays in one key, left hand in another)
        • Most work today is diatonic and tonal

       Tone color (or timbre)

      • Quality of sound from the medium e.g. musical instrument, or voice
      • There is a characteristic way of writing for each instrument
      • Single tone colors: Sections of an orchestra
        • Strings: Violin, viola, cello, bass
        • Woodwind: Flute, oboe, clarinet, bassoon
        • Brass: Horn, Trumpet, Trombone, Tuba
        • Percussion: Drums
      • Mixed tone colors:
        • Combination of single tone instruments
        • String quartet: 2 violins, viola, cello
        • Melodic line passes from one section to another in an orchestra
      • Jazz: Some instruments provide rhythm (piano, bass, percussion), other harmonic texture, one solo instrument plays the melody

      Music Texture

      • Monophonic: Single melodic line, no harmony. E.g. Chinese, Hindustani, Gregorian chants
      • Homophonic: Principal melodic line + Chordal accompaniment
        • Contrapuntal view: Two separate melodies progressing in time
      • Polyphonic: Separate and independent voices in the chordal progressions
        • 2-3 polyphonic voices can be perceived independently
        • E.g. Choral prelude (Bach), Jesu

      Music structure

      • Structural background of a lengthy piece of music. Various structures (sonata, fugue) have evolved over years.
      • Sections have a hierarchy. 
        • Large sections denoted by upper case letters (A-B-C etc.), called movements or sections
        • Smaller sections denoted by lower case letters (a-b-c...). Analogous to chapters and sections in a book. The classification is based on how repetition happens

      Larger sections:
      • Exact repetition
      • Sectional (symmetrical) repetition: 2-part, 3-part, rondo, free sectional
      • Variation: Basso ostinato, passacaglia, chaconne, theme and variations
      • Fugal: Fugue, Concerto grosso, Chorale prelude, Motets & madrigals
      • Development: Sonata
      • Free
      Smaller sections:
      • Exact: a-a-a-a
      • Minor alterations: a-a'-a''-a'''
      • Repetition after digression: a-b-a, a-b-a'
      • Non repetition:a-b-c-d

      Fundamental forms I: Sectional form

      Work is divided into distinct sections
      • 2 part form: A-B-A-B. E.g. Scarlatti's sonatas, No 413 (D minor), 104 (C major), 338 (G minor)
      • 3 part form: A-B-A; B is sometimes called the trio, A the minuet. Nocturnes, ballads, elegies, waltzes, and intermezzos are likely to be 3 part forms. E.g. Minuets of Haydn (String quartet, Op 17, No 5) and Mozart. Beethoven's Scherzo (Piano Sonata Op 27 No 2)
      • Rondo: A-B-A-C-A-D-A-.... i.e. sections separated by return to A. E.g. Haydn's Piano Sonata No 7 in D Major
      • Free sectional form: Any arrangement, e.g. A-B-B, A-B-C-A. E.g. Chopin's Prelude in C Minor, No 20

      Fundamental forms II: Variation form

      Piece is composed as a set of variations on a theme:
      • Basso ostinato: Short phrase repeated over and over in the bass section, while the upper parts proceed. E.g. Soldier's violin from Stravinsky's The Story of a Soldier
      • Passacaglia: Repeated bass part, but the bass part is a melodic phrase, not a figure, with some variation in each section, the work starts with unaccompanied bass theme. E.g. Bach's organ Passacaglia in C minor
      • Chaconne: Very similar to the Passacaglia, but with no opening unaccompanied bass theme, so it sounds like the first variation of a Passacaglia. E.g. last movement of Brahms's Fourth Symphony
      • Theme and variations: Variation of a simple, direct theme. The theme is usually a 2 or 3 part form. Five types of variation: Harmonic, Melodic, Rhythmic, Contrapuntal, and combinations. E.g. Mozart's A major Piano Sonata: Theme and six variations. Variation 1 is a florid melodic variation, Variation 4 is a skeletonizing of the harmony, Variation 3 is a major-key to minor-key harmonic change

      Fundamental forms III: Fugal form

      • Polyphonic/Contrapuntal in texture: Separate strands of melody concurrently. Needs repeated listening to be able to acquire the skill to differentiate the strands. Types of contrapuntal devices:
        • Imitation: Voices follow a leader, may enter at a different note. Only one melody, but spaced in time.
        • Canon: Imitation from beginning to end of piece
        • Inversion: Melody inverted; one voice follows the melody in the opposite direction. E.g. where the original moves up an interval, the inverted voice moves down by the same interval
        • Augmentation: Double time value of notes, slowing it down
        • Diminution: Halves the time values of notes
        • Cancrizans: Melody read backward
        • Inverted cancrizans: Melody backward, then inverted
      • Types:
        • Fugue proper: 3-4 voices
          • First voice enters, second voice enters, first voice adds a counter melody, then starts a free voice
          • Exposition, Subject, Subject, ...Stretto, Cadence
            • E.g. Bach, Well Tempered Clavichord
        • Concerto Grosso:
          • Two groups of instruments: Large (Tutti) and smaller (Concertino). E.g. Bach's Brandenburg Concerti (6, each having a different concertino)
        • Chorale prelude: Originated in choral works in churches. The melody is kept intact, the harmonies are made more complex. E.g. Bach's Orgelbuchlein
        • Motets/madrigals: Choral forms, the vocal fugal form. The motet is based on sacred words, the madrigal on secular words

      Fundamental forms IV: Sonata form

      • 3 or 4 movements (fast-slow-fast, or fast-slow-moderately fast-very fast)
        • Created by C.P.E. Bach (J.S. Bach's son). (Prior to this, a sonata was an instrumental work, contrasting with the vocal cantata)
        • 1st movement: Sonata Allegro:
          • 3 parts (ABA):
          • Exposition (abc): First theme is in the tonic, dramatic; second theme is feminine, in the dominant; closing theme is in the dominant
          • Development: Free section, combines material in the exposition, new and foreign keys
          • Recapitulation: Repeats the exposition, but with all themes in the tonic key
        • 2nd movement: Slow movement, may be a slow Rondo
        • 3rd movement: Minuet or scherzo, A-B-A, three part form
        • 4th movement: Extended rondo or in sonata allegro
        • Sometimes preceded by introduction and followed by a coda. E.g. Beethoven's Waldstein Sonata
      • Symphony: Sonata for orchestra: E.g. Beethoven's 9 symphonies
      • String quartet: Sonata for 4 strings
      • Concerto: Sonata for solo instrument + orchestra
      • Overtures: First movements of a sonata

      Fundamental forms V: Free forms

       Do not belong to the above structures
      • E.g. Preludes (for Piano). E.g. Bach's prelude, fugue
        • Clear progression of chordal harmonies from beginning to end without repetition of any themes. E.g Bach's B minor Prelude in Well Tempered Clavichord
      • E.g. Symphonic poems: Program music (as opposed to absolute music)

      Tuesday, December 25, 2012

      Designer genes - How the forces of natural selection are about to be replaced by the forces of human selection - Stephen Potter

      Surveys developments in genetics over the last fifty years, in particular developments which have led towards the possibility of genetic engineering of humans. These include:
      • The double helix model of the DNA (Watson/Crick, Nobel prize 1962)
      • The sequencing of the human genome: DNA sequencing (Sanger/Maxam/Gilbert, Nobel prize 1980), protein sequencing (Sanger, Nobel prize 1958), the Human Genome Project (Collins/Venter)
      • PCR: Creation of a large number of copies of a DNA sequence (Mullis, Nobel prize 1993)
      • Stem cells: Conversion of adult cells to stem cells (Yamanaka, Nobel prize 2010), allowing creation of multiple embryos by turning stem cells into gametes
      • Modification of the genes of a cell (Capecchi, Evans, Smithies, Nobel prize 2007)
      Discusses the ethical implications of technology that commercializes and combines this research to allow human genetic engineering.

      The current state of genetic engineering and what might be possible in the next few years:

      The following technologies have been demonstrated in research. Commercialization of these technologies to reduce cost and increase speed is underway and advances are expected in the next few years:
      • Preimplantation Genetic Diagnosis (PGD): Identifying the presence or absence of a particular gene in an embryo at an early stage, by extracting a single cell from an 8-cell IVF embryo.
      • Complete DNA sequencing on an embryo cell in a short time (order of hours)
      • Creation of thousands of embryos simultaneously through stem cells
      • Screening of multiple embryos in parallel
      • Modification of the genes of an embryo cell followed by implantation

      Which genes do what

      • DNA: A long molecule consisting of sequences of bases (A, G, C, T)
      • Codon: A triplet of bases that codes for an amino acid. 4 bases =>4^3 = 64 possible codons
      • Amino acids: Building blocks of proteins:
        • 20 possible amino acids (out of 64 possible codons => 44 codons either do not specify amino acids or more than 1 codon maps to the same amino acid.)
      • Gene: Sequences of codons that code for protein generation
      • RNA: Long molecule consisting of bases ( A, G, C, U)
        • RNA can act as genetic material as well as a catalyst, like a protein. Might explain the origin of life (Altman, Cech (Nobel prize 1989))
      • Proteins: Chains of amino acids (a few hundred) => more than 20^100 possible proteins
      • Generation of proteins from DNA:
        • Transcription: DNA used to generate mRNA, aided by protein RNA polymerase
          • RNA polymerase is a protein that takes one strand of DNA and transcribes an RNA copy (only the genes)
        • Translation: mRNA to Protein, happens in the ribosomes of the cell
      • Gene differences
        • Mutations: Deletion of a block in the DNA sequence -> deletion of a block in the protein
        • Frame shift: Deletion of single base
        • Single base difference: Single nucleotide polymorphism (SNPs):
          • Can be in codons or non coding part
        • An individual has 2 copies of each gene: One from each parent
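
The transcription/translation pipeline above can be sketched with a deliberately tiny codon table (the real table covers all 64 codons; this fragment, and the input strand, are illustrative only).

```python
# mRNA codon -> amino acid (partial table, for illustration)
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "STOP"}

def transcribe(dna_template):
    """Template DNA strand -> mRNA: complementary bases, T replaced by U."""
    pair = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pair[base] for base in dna_template)

def translate(mrna):
    """Read the mRNA in triplets (codons) until a STOP codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "STOP":
            break
        protein.append(aa)
    return protein

mrna = transcribe("TACAAACCGATT")  # a hypothetical template strand
print(mrna, translate(mrna))      # AUGUUUGGCUAA ['Met', 'Phe', 'Gly']
```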

      Sequencing the human genome

      • Genome: 3 billion bases
      • DNA per protein: 3 bases per amino acid, 100 -1000 amino acids per protein
        • => 3 billion/3000 = 1 million potential proteins or genes per genome
      • Turns out there are only 30K genes in the human genome
        • => Approx 2.5%
        • Determined by transcription analysis of RNA
      • Differences in people = 0.1% of the genome i.e 3 million bases out of 3 billion

      Sequencing revolution

      • Objective: Identify gene combinations (from the millions of SNPs in a single DNA sequence), that contribute to variation/disease
      • Map disease/traits to SNPs
      • 3 million differences between individuals
        • => Needs huge amount of data (samples from large population, mapped to diseases), computational power
      • As the complexity of the genetic variation causing a trait increases (more SNPs), the number of samples needed to identify it increases

      Time scales

      • Genetic evolution can be rapid when directed: on the order of 1000s of years
      • E.g. Evolution of dogs from wolves directed by man

      Gene expression

      • Transcription factors: Proteins that regulate transcription i.e expression of genes
        • Like all proteins, their generation is impacted by the genome
        • Can cause genetic cascades: Hundreds of genes altered in expression level
      • Introns: Intervening sequences within a gene that interrupt the coding sequence
        • Increase the flexibility of gene expression
        • Introns are removed from the RNA transcript, a process called RNA splicing
        • Exons: Expressed sequences
        • 2% of the sequence consists of introns that regulate gene expression
      • Genetic regulatory network: Interwoven network of genes, some regulating the expression of others
        • 3% of the genome is regulatory

      Jumping genes

      • Transposable elements: Discrete parts of the chromosome, capable of moving from one chromosome position to another (McClintock)
        • Enzymes allow the transposable elements to copy themselves, float around and attach to  a new place in the DNA sequence
        • Drosophila melanogaster: P elements detected in wild fruitfly DNA that were absent from fruitfly DNA extracted a few years previously
      • Horizontal gene transfer: Viral DNA: Retroviruses can convert RNA to DNA using enzyme called reverse transcriptase (Temin/Baltimore, Nobel 1975)
        • Causes hybrid dysgenesis i.e. reduced fitness
        • Repressors
      • Transduction: Moving DNA material from one species to another
        • Balance between harmful effects (which are subject to survival of the fittest) and preferential replication. Can spread because of the ability to out-replicate competing genome sequences.
      • DNA: 2% coding, 3% gene expression, 30% parasitic transposable
        • Why is there >50% with no known purpose? Has evolution created this unused portion to mitigate the effects of transposable DNA?

      Genetic disease

      • Every gene has two copies - one from each parent
      • A recessive mutant gene carried by both parents => child has a 25% chance of getting two bad copies
      • Nature vs. nurture: Minnesota Twins study: 70-80% of IQ is genetic
      • Genes have a surprising amount of contribution to psychological traits: love, faith
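The 25% figure follows from a simple Punnett-square enumeration, sketched here for two carrier parents of a recessive gene:

```python
# Mendelian inheritance of a recessive mutant gene when both parents
# are carriers (one good copy, one bad): enumerate the four equally
# likely allele combinations (a Punnett square).

from itertools import product
from fractions import Fraction

parent = ("good", "bad")  # each carrier parent has one of each allele

# The child inherits one allele from each parent; all 4 pairs are
# equally likely.
children = list(product(parent, parent))
affected = [c for c in children if c == ("bad", "bad")]

p_affected = Fraction(len(affected), len(children))
print(p_affected)  # 1/4 -- the 25% chance in the notes
```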


      • Gametes: Have 23 chromosomes
      • Chromosome: Long DNA molecule
        • Individual has 23 pairs of chromosomes: one set from each parent
        • One copy of each chromosome goes into the sperm/egg during meiosis => 2^23 combinations per gamete
        • Identical twins: Monozygotic (single egg, fertilized by a single sperm, divides after the blastocyst stage)
        • Fraternal twins: Dizygotic (two eggs fertilized by different sperm)
        • Chimera: Fusion of multiple eggs (fusion of two dizygotic embryos)
      • When the number of cells is < 32 (?), each cell can develop into any type of cell
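The 2^23 figure can be sketched as follows (ignoring crossing over, which makes the true number of distinct gametes far larger):

```python
# Combinatorics of meiosis from the notes: one copy of each of the
# 23 chromosome pairs goes into a gamete, so each parent can produce
# 2^23 distinct gametes. A given couple can therefore produce
# 2^23 * 2^23 genetically distinct children (before crossing over).

chromosome_pairs = 23
gametes_per_parent = 2 ** chromosome_pairs    # 8,388,608
child_combinations = gametes_per_parent ** 2  # 2^46

print(gametes_per_parent)  # 8388608
print(child_combinations)  # 70368744177664
```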

      Stem cells

      • Origin: 
        • Embryo cells
        • Adult stem cells: Bone marrow cells
      • Programming adult cells to become stem cells
        • History of development:
          • Gene expression appears to be controlled by a master switch and a genetic pyramid of hierarchy.
          • Genes at the top of the hierarchy control those below them
          • Homeobox controls the master blueprint i.e type of the species, e.g. fruitfly vs. mouse
          • Single genes can initiate extensive development programs, e.g. a single gene can drive the expression program for growth of a leg
        • Genes can make adult cells revert to stem cells
          • 4 genes activated in a mouse cell reverted it to a stem cell (Yamanaka, Nobel 2012)
        • Implication: Egg cells can be created, increasing the number of samples available for selection (over the ~500 created naturally)

      Gene modification

      • Technology to modify genes (Capecchi, Evans and Smithies, Nobel 2007)
      • Procedure:
        • Desired version of the gene is created in a test tube
          • Generated using DNA synthesis machine or recombinant DNA strategies
        • Synthetic gene introduced into stem cells grown in an incubator
          • Modified gene introduced into stem cells by electroporation
          • Process of change is not clearly understood, happens by DNA recombination similar to meiosis
        • One of these correctly engineered stem cells is used
          • Only 1 in a million stem cells can be used, need screening to detect the stem cells which are good
          • Screening done by polymerase chain reaction (PCR), similar to the procedure used in DNA matching
        • PCR creates a large number of copies of the DNA sequence (Mullis, Nobel 1993)
        • Genetically altered stem cells added to a blastocyst
      • Stem cell cloning: Dolly the sheep, 1997, cloned from the nucleus of a mammary gland cell

      Ethical questions

      • Ontogeny recapitulates phylogeny: Development of the individual copies evolutionary history
        • E.g. In embryos a primitive pair of kidneys is formed followed by a more advanced pair, and then the final pair
        • Is the early embryo truly human?
      • Optimal gene combinations
        • Sickle cell gene provides resistance to malaria: Genes are tradeoffs, not binary decisions
        • Connection between artistic genius and mental illness
      • Appearance of the Foxp2 gene responsible for speech (absent in chimpanzees), coincides with explosion in rate of progress.
      • Is genetic engineering any different from eugenics, the improvement of the gene pool via human selection?

      Thinking, Fast and Slow, Daniel Kahneman

      Kahneman's work explains areas in behavioural economics, specifically prospect theory (how decisions are made when outcomes are probabilistic) and the effects of cognitive biases on choice. It explains the process by which people reach conclusions/make decisions and why the choices are often the wrong ones. The book has several examples that illustrate errors of judgement and choice in analytical situations, mainly the results of cognitive biases. The book covers research by Kahneman and others from the 70s to the present. Prior to the work described here, one of the assumptions made by economists/social scientists in their research was that people are rational, and departures from rationality were believed to be products of emotion. Kahneman's research made the claim that departures from rationality are caused by flaws in the cognitive machinery itself, i.e. cognitive biases. His work describes how the mind works based on recent developments in psychology. The mind is subject to the influence of heuristics, intuition and biases, and its functioning can be explained by three models:
      •  A model of the mind consisting of two components:
        • System 1: Fast automatic thinking: By intuition or by expertise
        • System 2: Slow engaged thinking: Deliberation, algorithmic, measured
      This model explains how and why humans reach erroneous conclusions when presented with simple mathematical choices. The book describes 10-15 heuristics and biases which cause System 1 to reach erroneous conclusions.
      • Two economic models of human behaviour called Econs (rational, selfish and invariant in tastes) and Humans (real people). Modern economic theory/ modelling is based on Econs which explains why economic models to date are flawed.
      • The Experiencing self and the Remembering Self: Two ways in which humans consider memories of events which cause incorrect decisions because of incorrect assessments of past experiences.
      The work uses these models to illustrate how modern economic models are flawed and how human decision making is flawed when evaluating decisions involving risks.  

      Part I: This section describes the systems.

      • The two systems:
        • System 1: Operates quickly , no effort, no voluntary control
        • System 2: Deliberate, requires attention, can reprogram System 1 for a specific task
        • The division of labor maximizes performance and minimizes effort.
      • Attention/Effort
        • It takes effort for System 2 to get engaged.
        • Law of least effort: A person will engage the system that allows the task to be performed with least effort.
        • Experts in any field are able to solve problems in their field using System 1.
      • Lazy control
        • System 2 is engaged less often than it should be, because of "laziness"
        • Cognitive load: Load placed on the mind because of System 2 being engaged in one task.
        • Ego depletion: Depletion of self control causes System 1 to be engaged because of cognitive load on System 2 on another task.
        • The nervous system consumes more glucose than most of the rest of the body
        • Unless explicit effort is made, an individual will favor using System 1 without engaging System 2
        • System 2 can be divided into 2 components:
          • Intelligence: IQ
          • Rationality: Immunity to bias
      • Association
        • Association (ideas or suggestion) affects System 1's perceptions/decisions
        • Priming affects System 1's perception/decision
      • Cognitive ease/Cognitive strain
        • A measure of an individual's current condition; can predict the likelihood of using System 1 vs System 2
        • When in a state of cognitive ease, System 1 predominates
        • Cognitive ease can be brought on by association, priming
        • Cognitive strain can be brought on by associated difficulties (bad fonts e.g.)
      • Norms, causes
        • Past events can cause System 1 to believe in a norm, i.e. a stereotype or perception of normal behavior
        • The mind has a need to assign causality to events
        • System 1 is incapable of making correct conclusions about causality - it does not have the ability to think statistically
      • How conclusions are reached by System 1
        • Confirmation Bias: A deliberate search for confirming evidence
        • Halo effect: Tendency to reach erroneous conclusions in one dimension based on liking a person for another dimension
        • WYSIATI (What You See Is All There Is): Conclusions from limited evidence, leading to base-rate errors, framing effects, overconfidence
      • How judgments happen in System 1 when inadequate information is provided
        • Neglect of information, use of basic assessments
      • How questions are answered:
        • Substitution: When faced with a difficult question, individuals use a heuristic to substitute a simpler question that can be answered
        • Affect heuristic: Likes and dislikes determine beliefs about the world

      Part II: Heuristics and Biases: This section lists a number of biases/heuristics/intuitive conclusions which cause System 1 to reach erroneous conclusions.

      • Law of small numbers:
        • Even researchers make mistakes on sample size: Sample size is low, even in research experiments. A small sample will exaggerate the effect of outliers.
        • System 1 believes it can see order, where randomness exists
        • Causal explanations of chance events are invariably wrong
        • Solution: When conducting experiments, decorrelate results by averaging
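The small-sample effect can be illustrated with the standard error of a sample proportion (a standard statistics formula, not taken from the book): with 100x the data you get only 10x the precision, so small samples routinely show "extreme" rates by chance alone.

```python
# The "law of small numbers": small samples exaggerate chance
# variation. For a fair-coin-like outcome (p = 0.5), the standard
# error of the observed proportion shrinks as 1/sqrt(n).

import math

def std_error(p: float, n: int) -> float:
    """Standard error of a sample proportion from n observations."""
    return math.sqrt(p * (1 - p) / n)

p = 0.5
small, large = std_error(p, 10), std_error(p, 1000)

print(round(small, 3))       # 0.158 -- a 10-case sample swings widely
print(round(large, 3))       # 0.016 -- a 1000-case sample hugs 50%
print(round(small / large))  # 10 -- 100x the data, only 10x the precision
```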
      • Anchors
        • Providing an anchor when asking a question can influence the response: E.g would you contribute $100 to this cause? If not how much?
      • Availability
        • Availability of the memory of events, can influence perception of frequency of the events
        • Difficulty in remembering a large number of events can alter perception of their frequency, even if the absolute number is higher
      • Impact of availability
        • Emotional tail wags the rational dog
        • Availability bias attempts to create a world that is simpler than reality
        • Availability cascade: An emotional response to available events that results in bias flowing into public policy
      • Representation bias:
        • Stereotyping used without examination of bias, or stats about accuracy of stereotypes
        • Base rate information tends to be neglected when specific instance information is available
        • Always apply Bayesian analysis
      • Representation bias with varying degrees of information
        • System 1 often judges a more specific condition (smaller population) to be more likely than a more general one (larger population) because it better satisfies a representative stereotype (the conjunction fallacy)
      • Causes vs Statistics
        • Base rates are ignored, even causal statistics may not change deeply held beliefs
      • Regression to mean
        • Regression to the mean is often interpreted as a causal event
        • Regression and correlation are related concepts. Where correlation is not perfect, there will be regression to the mean
      • Taming intuitive predictions
        • Use correlation to obtain a prediction that lies between an intuitive prediction and the base rate
        • Unbiased predictions will not predict extreme cases, unless a lot of information is available
        • In some cases, such as venture capital, this may  be detrimental because they are searching for extreme cases
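Kahneman's corrective procedure above amounts to shrinking the intuitive estimate toward the base rate in proportion to the evidence's correlation with the outcome; the numbers in the example below are hypothetical:

```python
# Taming an intuitive prediction, per the notes: start from the base
# rate (the mean) and move toward the intuitive estimate only in
# proportion to the correlation between evidence and outcome.

def tamed_prediction(base_rate: float, intuition: float,
                     correlation: float) -> float:
    return base_rate + correlation * (intuition - base_rate)

# Hypothetical example: class average GPA is 3.0, intuition says 3.8,
# and the available evidence correlates ~0.3 with GPA.
print(round(tamed_prediction(3.0, 3.8, 0.3), 2))  # 3.24 -- regressed toward the mean
print(tamed_prediction(3.0, 3.8, 0.0))            # 3.0 -- no evidence: predict the mean
print(tamed_prediction(3.0, 3.8, 1.0))            # 3.8 -- perfect evidence: keep intuition
```

Note how, as the bullet above says, such predictions never produce extreme cases unless the correlation (i.e. the information) is very strong.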

      Part III: Overconfidence: Other reasons System 1 makes mistakes

      • Illusion of understanding
        • The mind creates an illusion of understanding by believing WYSIATI
        • Hindsight bias creates the illusion that outcomes were obvious and that decisions were obvious
        • Outcome bias affects the perception of decisions based on the results
        • Halo effect affects the perception of human decisions based on organization outcomes
      • Illusion of validity: A cognitive illusion
        • The illusion of skill/validity
        • Supported by a powerful professional culture
        • Hedgehogs and foxes: hedgehogs fit events to a single framework and predict based on that
        • Media favors appearance of hedgehogs in debates
      • Intuition vs Formulas
        • System 1 is influenced by several factors (priming etc. above)
        • The result is that statistical prediction will generally  outperform human expert prediction (Meehl, Clinical vs. Statistical prediction)
        • Humans tend to try to think outside the box, adding inconsistency to their judgments
        • When predictability is poor, inconsistency (generated by System 1) destroy predictive validity
        • Broken leg rule: Occurrence of outlier events impacts prediction
        • Combining predictors (averaging them) is better than a linear multiple regression algorithm
      • When can we trust expert intuition
        • Other school of thought: Naturalistic Decision Making: Seeks to understand how intuition works (Gary Klein, Sources of Power)
        • Intuition: System 1 implements rapid pattern recognition, with System 2 executing a deliberate process to make sure that the decision will work
        • Requirements:
          • An environment that is regular enough to be predictable
          • Prolonged practice at identifying the regularities
          • E.g. Chess players can rapidly and intuitively recognize a situation as weak or strong, but this needs approx 6 years of practice at 5 hrs/day
      • The outside view
        • Inside view vs. Outside view: Knowledge about an individual case makes an insider feel no need for the statistics of the case
        • Exhibited as a belief in the uniqueness of the case
        • Planning fallacy: Plans and forecasts are unrealistically close to the best case
      • The engine of capitalism
        • Irrational optimism: Optimistic bias plays a dominant role in risk taking
        • Overconfidence in ones own forecast: An effect of System 1 and WYSIATI
        • Remedy: Prepare a premortem for all decisions: Assume the decision has been made and has resulted in a disaster, then write the history of that disaster

      Part IV: Choice: What influences human choice

      • Bernoulli's errors
        • Humans vs. Econs
          • Econs: Rational, Selfish, Maximize utility, Tastes do not change
        • Utility theory (Bernoulli)
          • Prior to Bernoulli, outcomes of gambles were compared based on outcomes (expected values)
          • Bernoulli realized that people dislike risk and this was explained by diminishing marginal value of wealth
          • Assigned a utility to each level of wealth, with the increase in utility diminishing as wealth increases
          • Diminishing returns
          • Explains insurance: Risk is transferred from poor person (with higher loss of utility) to a richer person (lower loss of utility)
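Diminishing marginal utility can be sketched with log utility (a standard functional form; the notes above don't fix one), which also shows why insurance works: the same dollar loss costs the poor far more utility than it costs the rich.

```python
# Bernoulli's diminishing marginal utility, sketched with log utility.
# Transferring risk from a poor person to a rich insurer benefits both
# because the same loss destroys much less utility for the rich party.

import math

def utility(wealth: float) -> float:
    return math.log(wealth)

loss = 10_000
poor, rich = 20_000, 1_000_000

drop_poor = utility(poor) - utility(poor - loss)
drop_rich = utility(rich) - utility(rich - loss)

print(round(drop_poor, 3))  # 0.693 -- losing half your wealth hurts a lot
print(round(drop_rich, 3))  # 0.01  -- the same loss barely registers
```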
      • Prospect theory:
        • Utility theory has a flaw: Utility is not absolute, it depends on the reference point
        • The change in utility differs based on direction: A loss of $500 has greater negative utility than a gain of $500
        • Depends on increase/decrease: E.g. $5M has a different utility if it is reached by an increase from $1M than by a decrease from $10M
        • Taking this into account results in different predictions for how willing a poor or a rich person is to take risk
        • Conclusion: If all options are bad, people tend to prefer gambling/risk taking, else they  avoid risk
        • Prospect theory
          • How financial decisions are made:
          • Evaluation compare to a reference point: status quo
          • Diminishing sensitivity to the evaluation of changes
          • Loss aversion
          • Gain/loss vs. Psychological utility is an S curve, but not a symmetric curve
          • Problems: Does not account for regret, disappointment
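The asymmetric S-curve can be sketched with the Tversky-Kahneman value function; the parameter values below are their published estimates, not something stated in these notes:

```python
# Prospect theory's value function: an S-curve over gains and losses
# relative to a reference point, steeper for losses (loss aversion).
# Tversky-Kahneman parameter estimates: alpha = beta = 0.88, lambda = 2.25.

ALPHA = 0.88   # diminishing sensitivity for gains
BETA = 0.88    # diminishing sensitivity for losses
LAMBDA = 2.25  # loss-aversion coefficient

def value(x: float) -> float:
    """Psychological value of a gain/loss x relative to the reference point."""
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * ((-x) ** BETA)

# The asymmetry: a $500 loss hurts more than a $500 gain pleases.
print(abs(value(-500)) > value(500))          # True
print(round(-value(-500) / value(500), 2))    # 2.25 -- losses loom ~2x larger
```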
      • Endowment effect
        • Decisions are impacted by whether a good is meant for exchange or for use
        • Psychological value of a good for use, such as a mug or an already possessed good can change the utility of selling it
      • Bad events
        • Loss aversion is with respect to a reference point
        • Not achieving a goal may be a loss, exceeding a goal may be a gain
        • Impacts negotiations, where parties fight harder to avoid losses than to make gains
        • In a negotiation both parties feel they have lost more than gained
        • The asymmetry between feelings of gain/loss impacts the feeling of fairness: Can impact whether customers choose to buy products whose prices have risen
        • Fairness: It is considered unfair to impose losses on a customer, relative to his reference point
        • Reference points cause a sense of entitlement
      • Fourfold pattern
        • Outweighing of Small probability events
        • Decision weights are not identical to probability weights
        • =>Expectations (weighing by probability) is flawed
        • Decisions are made based on decision weights not probabilities
        • Decision weight = probability only at p=0 and p=1; for all other values d ≠ p (d > p or d < p depending on p)
        • Near p=0 decisions show the possibility effect; near p=1, the certainty effect
        • Fourfold pattern: Gain/Loss vs. High/Low probability
        • The fourfold pattern shows how high/low probability of a gain or loss results in  acceptance/rejection of unfavorable/favorable outcomes in negotiations because of the  aversion to loss/hope of gain and consequent risk taking/aversion
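The d vs. p relationship can be sketched with the Tversky-Kahneman probability weighting function (gamma = 0.61 is their estimate for gains; an assumed parameter here, not from the notes):

```python
# Decision weights vs. probabilities: d = p only at p = 0 and p = 1;
# small probabilities are overweighted, moderate ones underweighted.

GAMMA = 0.61

def decision_weight(p: float) -> float:
    if p in (0.0, 1.0):
        return p  # the only fixed points: impossibility and certainty
    num = p ** GAMMA
    den = (p ** GAMMA + (1 - p) ** GAMMA) ** (1 / GAMMA)
    return num / den

print(decision_weight(0.0), decision_weight(1.0))  # 0.0 1.0 -- fixed points
print(decision_weight(0.01) > 0.01)  # True: rare events overweighted
print(decision_weight(0.5) < 0.5)    # True: moderate ones underweighted
```

Overweighting of small probabilities is exactly what drives the fourfold pattern's corners: buying lottery tickets (small-p gain) and buying insurance (small-p loss).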
      • Rare events
        • People overestimate the probabilities of unlikely events
        • People overweight unlikely events
        • Vivid or alternative descriptions of events influence decision weights (1 in 1000 vs. 0.1%)
      • Risk policies
        • People tend to be risk averse in gains and risk taking in losses
        • Broad framing (the grouping of several decision problems into a single problem) can result in better decisions than narrow framing (separately deciding each problem).
        • Samuelson's problem: Aversion to a single gamble vs expected value of several hundred instances of the gamble
        • Since a life will consist of several such small gambles, it pays to take the small gambles
          • Gambles must be independent experiments
          • Gambles must not be excessive
          • Gambles must not be long shots
        • Loss aversion + narrow framing leads to bad (risk-averse) decisions
        • E.g. individual managers are risk averse because they take individual decisions. A CEO frames the decisions broadly, and favors taking a risk, in the hope that statistically one of them will pay off
      • Keeping score
        • Disposition effect: A product of narrow framing: E.g. the tendency to sell winning stock in preference to losing stock, because of the pain caused by acknowledging and closing out a losing position.
        • Sunk cost fallacy: Tendency to throw good money at a bad project in the hope of salvaging it
        • Regret/blame: People have stronger reactions to an outcome produced by action than to an outcome produced by inaction (regret)
        • There is an aversion to trading increased risk for any other advantage, even if the advantage is significantly more gainful than the risk
        • Regret/hindsight bias causes regretful feelings when a moderate amount of thought has gone into decisions
        • Think deeply and anticipate regret, or think little.
      • Reversals
        • Preference reversals: Preference can change when two choices are compared jointly vs. if they are presented singly
        • Frames and Reality
        • Losses cause stronger negative feelings than cost
        • Framing a decision can impact decisions: gallons per mile vs. miles per gallon

      Part V: Two selves: How memories are assessed

      • Two selves
        • Experienced utility vs. Decision utility
        • Experiencing self vs. Remembering self
        • The experiencing self registers satisfaction across the whole experience, while the remembering self retains only selected parts of it
        • Peak end rule: Intense events towards the end of an experience are remembered
        • Duration neglect: Durations of experiences are often forgotten while intensity is not
      • Life as a story
        • Duration neglect, peak end rule and the remembering self impact decisions
      • Experienced well being/Thinking about life
        • Measures of happiness  reflect the remembering self not the experienced self
        • Affective forecasting: The effect of recent significant memories on opinion
        • Focusing illusion: Nothing is as important as you think when you are thinking about it
      • Conclusions
        • System1/System2, Econs/Humans, Experiencing self/Remembering self