epoch = one loop over the whole training sample
for each feature vector, the weights are updated with a gradient-descent step
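A minimal sketch of one such epoch, assuming a linear hypothesis \(h(x) = w^Tx\) and a squared-error loss (the loss and all names here are illustrative, not necessarily the setup used in the talk):

```python
import numpy as np

def sgd_epoch(X, y, w, alpha=0.01):
    """One epoch = one loop over the whole training sample;
    the weights are updated once per feature vector."""
    for x_i, y_i in zip(X, y):
        error = y_i - w @ x_i        # residual for this sample
        w = w + alpha * error * x_i  # gradient-descent step
    return w
```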
target: \(y = 0, 1\)
linear regression is not well suited for classification
imagine having some data with one point far out at \(x \sim 100\): it drags the fitted line and shifts the decision threshold
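A toy numeric example (data invented for illustration) showing how a single far-out point spoils a least-squares fit used as a classifier:

```python
import numpy as np

# class 0 near x ~ 0, class 1 near x ~ 1, plus one class-1 point at x ~ 100
x = np.array([0.0, 0.2, 0.4, 0.8, 1.0, 100.0])
y = np.array([0, 0, 0, 1, 1, 1])

w1, w0 = np.polyfit(x, y, 1)   # least-squares line
pred = (w0 + w1 * x) >= 0.5    # classify by thresholding at 0.5

print(pred)  # the outlier drags the line down, so the points
             # near x ~ 1 fall below 0.5 and are misclassified
```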
Logistic function: \[g(z) = \frac{1}{1 + e^{-z}}\]
Probability of 1: \[P (y = 1 | x, w) = h(x)\]
Probability of 0: \[P (y = 0 | x, w) = 1 - h(x)\]
Probability: \[p (y | x, w) = (h(x))^y\cdot(1 - h(x))^{1 - y}\]
Likelihood: \[L(w) = \prod\limits_{i=0}^n p(y^{(i)} | x^{(i)}, w) = \prod\limits_{i=0}^n (h(x^{(i)}))^{y^{(i)}}\cdot(1 - h(x^{(i)}))^{1 - y^{(i)}}\]
Log-likelihood: \[l(w) = \log L(w) = \sum\limits_{i=0}^n y^{(i)}\log h(x^{(i)}) + (1 - y^{(i)})\log (1-h(x^{(i)}))\]
Learning step (maximize \(l(w)\)): \[w_j = w_j + \alpha\frac{\partial l(w)}{\partial w_j} = w_j + \alpha\sum\limits_{i=0}^n\left(y^{(i)} - h (x^{(i)})\right)x_j^{(i)}\]
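The whole procedure fits in a few lines; this sketch implements exactly the update above in vectorized form (the learning rate and epoch count are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, alpha=0.1, epochs=100):
    """Gradient ascent on the log-likelihood l(w):
    w_j <- w_j + alpha * sum_i (y_i - h(x_i)) * x_ij.
    X is (n, d) with a bias column; y holds 0/1 targets."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        h = sigmoid(X @ w)          # h(x) for every sample
        w += alpha * X.T @ (y - h)  # the sum over i, for all j at once
    return w
```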
We can do classification
We can do regression
But real problems are nonlinear
Feature vector: \[(x,y) \rightarrow (x,y,x^2,y^2)\]
Hypothesis: \[h (x) = \frac{1}{1 + e^{-w_0 - w_1x - w_2y - w_3x^2 - w_4y^2}}\]
\[h(x) = \frac{1}{1 + e^{-w^Tx}}\]
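The same machinery then applies unchanged: expand the features first and feed them to the logistic-regression update. A sketch (`expand` is a hypothetical helper; `train_logistic` refers to the sketch above):

```python
import numpy as np

def expand(X):
    """(x, y) -> (1, x, y, x^2, y^2): bias plus quadratic terms,
    so a boundary that is linear in the new features can be an
    ellipse in the original (x, y) plane."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x, y, x**2, y**2])

# w = train_logistic(expand(X), labels)  # learns w_0 ... w_4 above
```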
Intuition:
x XOR y = (x AND NOT y) OR (y AND NOT x)
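XOR is the classic case a single linear unit cannot represent, while two layers can. A hand-wired sketch following the identity above (weights and thresholds picked by hand):

```python
def step(z):
    """Threshold unit: fires iff its input is non-negative."""
    return 1 if z >= 0 else 0

def xor_net(x, y):
    """x XOR y = (x AND NOT y) OR (y AND NOT x) as a two-layer net."""
    h1 = step(x - y - 0.5)      # x AND NOT y
    h2 = step(y - x - 0.5)      # y AND NOT x
    return step(h1 + h2 - 0.5)  # h1 OR h2

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # prints the XOR truth table
```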
(figures omitted; image sources: deeplearning.net, wildml.com, arXiv)
The first goal is to use a CNN to find the vertex in the nuclear-target region
Classification: upstream of target 1, target 1, plastic between target 1 and target 2, target 2... (see the sketch below)
Regression: in progress, no luck so far
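For illustration only, a minimal CNN classifier of the kind described, written here in PyTorch; the actual network, framework, input size, and number of segment classes are not specified in these slides, so every value below is an assumption:

```python
import torch
import torch.nn as nn

N_SEGMENTS = 11  # hypothetical number of z-segments (targets + plastic)

class VertexCNN(nn.Module):
    """Maps a single-view detector 'image' to one score per segment."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(N_SEGMENTS),  # one logit per segment class
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# usage: logits = VertexCNN()(torch.randn(8, 1, 64, 64))
```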
Apply this to Marianette's DIS measurement
Next steps: NC\(\pi^0\)? \(\pi\) momentum? hadron multiplicities?
started with smaller samples to save GPU time and memory
working on "z-extension"
At this point we use me1B to train the net
and me1A to test it