Activities in each layer need to be a non-linear function of the activities in the layer below, in order to "compute a series of transformations that change the similarities between cases"
RNNs
Directed cycles
"More biologically realistic"
Equivalent to deep (feed-forward) nets with one hidden layer per time slice
Except: they use the same weights at each time slice, and they get input at every time slice
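The unrolling idea above can be sketched in code: a minimal RNN step applied once per time slice, reusing the same weight matrices each time (the function and variable names here are illustrative, not from the lecture):

```python
import numpy as np

def rnn_unrolled(x_seq, W_in, W_rec, h0):
    """Run a simple RNN by unrolling it into one layer per time slice.

    The same weights (W_in, W_rec) are reused at every slice, and a
    fresh input x_t arrives at every slice -- the two differences from
    an ordinary deep feed-forward net.
    """
    h = h0
    states = []
    for x_t in x_seq:
        # each slice is a non-linear function of the slice below,
        # just like a hidden layer in a deep net
        h = np.tanh(W_in @ x_t + W_rec @ h)
        states.append(h)
    return states
```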
Symmetrically connected nets
Like RNNs, but connections have weights in both directions
Hopfield realized they're easier to analyze than general RNNs, because they're more limited in what they can do ("they obey an energy function")
Can't model cycles
A symmetrically connected net without hidden units is called a Hopfield net
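The energy function mentioned above is what makes these nets tractable; a small sketch of the standard Hopfield energy (with the usual conventions: binary states in {-1, +1}, symmetric weights, zero self-connections) is:

```python
import numpy as np

def hopfield_energy(s, W, b):
    """Energy of a Hopfield net.

    s : state vector with entries in {-1, +1}
    W : symmetric weight matrix (W == W.T, zero diagonal)
    b : bias vector

    E = -1/2 * s^T W s - b^T s

    Each single-unit update under the threshold rule can only lower
    (or keep) this energy, which is why Hopfield nets are easier to
    analyze than general RNNs.
    """
    return -0.5 * s @ W @ s - b @ s
```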
Lec 2.2 - Perceptrons
Popularized in the early 1960s by Frank Rosenblatt; his book Principles of Neurodynamics described many kinds of perceptrons.
Fell into disfavor after Minsky and Papert (1969) showed they were limited in what they could learn
Still used today for tasks with big feature vectors (e.g. millions)
(5:15) Decision unit is a binary threshold neuron
Computes a weighted sum of inputs from other neurons, plus a bias; outputs 1 if that sum is at least 0, otherwise 0
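The decision unit described above fits in a few lines; a minimal sketch (function name and the example weights are mine, not from the lecture):

```python
import numpy as np

def binary_threshold(x, w, b):
    """Perceptron decision unit: weighted sum of inputs plus bias,
    output 1 if the total is at least 0, else 0."""
    return 1 if np.dot(w, x) + b >= 0 else 0
```

For example, with weights [1, 1] and bias -1.5 the unit computes a logical AND of two binary inputs: only input [1, 1] pushes the sum (2 - 1.5 = 0.5) past the threshold.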