
Progress report of the MYC inference project

On GitHub: Xparx/Inferring-MYC-GRN-presentation

Inferring the MYC GRN

Andreas Tjärnberg

andreas.tjarnberg@scilifelab.se

Sonnhammer group meeting; 2015-03-26

Why?

Specific

  • Myc is an important cancer-related transcription factor.
  • Thousands of genes have been shown to be targets of Myc.

General

  • Gene regulatory networks give insight into the flow of information in living cells.
  • The responses of complex regulatory systems could then be predicted, controlled, and altered.

What?

The Data Properties

  • siRNA knockdown data
  • 45 genes (40)
  • 3 replicates (× 2 technical) of single-gene knockdowns
  • a single set of experiments with double knockdowns
  • \(\text{SNR}_v = 0.56\)
  • condition number \(\kappa \approx 70\)

The Data Properties

\[ \begin{array}{r c l} \text{SNR}_v &\equiv& \min_i \frac{||\boldsymbol{y}_i||}{\sqrt{\chi^{-2}(\alpha,M)\lambda}}\\ \\ \kappa &\equiv& \frac{\sigma_1}{\sigma_N}\\ \end{array} \]
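As an illustration, a minimal sketch (not the actual analysis code; the names Y, lam, and alpha are assumptions) of how these two quantities could be computed with NumPy/SciPy:

    import numpy as np
    from scipy.stats import chi2

    def data_properties(Y, lam, alpha=0.05):
        """Variance-based SNR and condition number of the expression matrix Y.

        Y     -- N genes x M experiments
        lam   -- estimated noise variance (lambda in the formula above)
        alpha -- significance level for the inverse chi-square quantile
        """
        N, M = Y.shape
        noise_level = np.sqrt(chi2.ppf(alpha, df=M) * lam)   # sqrt(chi^-2(alpha, M) * lambda)
        snr_v = min(np.linalg.norm(Y[i, :]) / noise_level for i in range(N))
        sigma = np.linalg.svd(Y, compute_uv=False)           # singular values, descending
        kappa = sigma[0] / sigma[-1]                         # condition number
        return snr_v, kappa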

The Model

\[ \begin{array}{r c l} \dot{x}_i(t) &=& \sum_{j=1}^N a_{ij}x_j(t) + p_i(t) - f_i(t)\\ \\ y_i(t) &=& x_i(t) + e_i(t)\\ \end{array} \]
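To make the model concrete, a hypothetical sketch (illustrative sizes and noise levels, not the MYC data) that generates steady-state data from it: at steady state \(\dot{x} = 0\), so \(x = -A^{-1}(p - f)\), and measurement noise is added on top.

    import numpy as np

    rng = np.random.default_rng(0)
    N, M = 10, 10                                        # genes, experiments (illustrative)

    A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))   # network with self-degradation
    P = -np.eye(N)                                       # one knockdown per experiment
    F = 0.05 * rng.standard_normal((N, M))               # noise on the perturbations
    E = 0.05 * rng.standard_normal((N, M))               # measurement noise

    X = -np.linalg.solve(A, P - F)                       # steady state: A X = -(P - F)
    Y = X + E                                            # measured expression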

How?

Model fitting

\[ \boldsymbol{Y} - \boldsymbol{E} = -\boldsymbol{A}^{-1}(\boldsymbol{P} - \boldsymbol{F}) \]


Model fitting

\[ \begin{array}{r c l} \boldsymbol{Y} - \boldsymbol{E} &=& -\boldsymbol{A}^{-1}\boldsymbol{P}\\ &\Downarrow&\\ \boldsymbol{\hat{Y}} &=& -\boldsymbol{A}^{-1}\boldsymbol{P}\\ &\Downarrow&\\ \boldsymbol{A}\boldsymbol{\hat{Y}} &=& -\boldsymbol{P}\\ &\Downarrow&\\ \boldsymbol{A}\boldsymbol{\hat{Y}} + \boldsymbol{P} &\approx& \boldsymbol{0}\\ \end{array} \]

Model fitting

Least squares

\[ \begin{array}{l} \boldsymbol{A_{ls}} = \arg\min_{\boldsymbol{A}} || \boldsymbol{A}\boldsymbol{\hat{Y}} + \boldsymbol{P}||_{\ell_2}\\ \text{subject to:}~~ \boldsymbol{A}\boldsymbol{\hat{Y}} = -\boldsymbol{P}\\ \end{array} \]
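A minimal sketch of the unconstrained least-squares fit, assuming Y_hat and P are stored as genes × experiments arrays (the function name is illustrative):

    import numpy as np

    def least_squares_A(Y_hat, P):
        """Least-squares estimate of A from A @ Y_hat ≈ -P.

        Transposing gives Y_hat.T @ A.T ≈ -P.T, a standard least-squares
        problem with one right-hand side per gene.
        """
        At, residuals, rank, sv = np.linalg.lstsq(Y_hat.T, -P.T, rcond=None)
        return At.T

With the synthetic Y and P from the model sketch above, least_squares_A(Y, P) gives an estimate close to A when the noise is small.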

LASSO

\[ \boldsymbol{A_{\zeta}} = \arg\min_{\boldsymbol{A}} || \boldsymbol{A}\boldsymbol{\hat{Y}} + \boldsymbol{P}||_{\ell_2} + \zeta||\boldsymbol{A}||_{\ell_1} \]
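The ℓ1-penalised problem separates over the rows of \(\boldsymbol{A}\), so each gene's incoming links can be fitted independently. A minimal sketch using scikit-learn's Lasso; the exact correspondence between \(\zeta\) and scikit-learn's alpha depends on its internal scaling, so alpha=zeta is only illustrative.

    import numpy as np
    from sklearn.linear_model import Lasso

    def lasso_A(Y_hat, P, zeta):
        """Row-wise LASSO estimate of A from A @ Y_hat ≈ -P."""
        N = P.shape[0]
        A = np.zeros((N, Y_hat.shape[0]))
        for i in range(N):
            # Row i of A solves min_a ||Y_hat.T @ a + P[i]||^2 + zeta * ||a||_1
            model = Lasso(alpha=zeta, fit_intercept=False, max_iter=10_000)
            model.fit(Y_hat.T, -P[i, :])
            A[i, :] = model.coef_
        return A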

Bootstrap Regularization

  • A statistical resampling method for estimating the sampling distribution of a statistic
  • Applied to the LASSO formulation by Bach 2008 (Bolasso)

Bootstrap Regularization

Procedure

  • Sample uniformly, with replacement, from the data \([\boldsymbol{\hat{Y}}, \boldsymbol{P}]\) to form a new data set \([\boldsymbol{\hat{Y}_{bs}}, \boldsymbol{P_{bs}}]\).
  • Apply the LASSO to the new data set: \[ \boldsymbol{A_{reg}}(\zeta) = \arg\min_{\boldsymbol{A}} || \boldsymbol{A}\boldsymbol{\hat{Y}_{bs}} + \boldsymbol{P_{bs}}||_{\ell_2} + \zeta||\boldsymbol{A}||_{\ell_1} \]
  • Repeat for \(n\) bootstraps.

A link \(a_{ij}\) then has bootstrap support \(s_{ij} = \frac{e_{ij}}{n}\cdot 100\%\), where \(e_{ij}\) is the number of times \(a_{ij}\) was nonzero in \(\boldsymbol{A_{reg}}(\zeta)\).
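A minimal sketch of this procedure, resampling experiments (columns) with replacement and reusing the hypothetical lasso_A helper sketched above:

    import numpy as np

    def bootstrap_support(Y_hat, P, zeta, n_boot=100, seed=0):
        """Percentage of bootstrap re-fits in which each link a_ij is nonzero."""
        rng = np.random.default_rng(seed)
        N, M = P.shape
        counts = np.zeros((N, Y_hat.shape[0]))
        for _ in range(n_boot):
            idx = rng.integers(0, M, size=M)             # resample experiments with replacement
            A_reg = lasso_A(Y_hat[:, idx], P[:, idx], zeta)
            counts += (A_reg != 0)                       # e_ij: times a_ij appeared
        return 100.0 * counts / n_boot                   # bootstrap support s_ij in percent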


Prediction

\[ {\Large{\text{RSS}}} = \left| \left( \underbrace{\begin{bmatrix} \hat{y}_{1,1} & \hat{y}_{1,2} & \cdots & \hat{y}_{1,n} \\ \hat{y}_{2,1} & \hat{y}_{2,2} & \cdots & \hat{y}_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{y}_{m,1} & \hat{y}_{m,2} & \cdots & \hat{y}_{m,n} \end{bmatrix}}_{\text{Predicted}} - \underbrace{\begin{bmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,n} \\ y_{2,1} & y_{2,2} & \cdots & y_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m,1} & y_{m,2} & \cdots & y_{m,n} \end{bmatrix}}_{\text{Measured}} \right)^{.2} \right| \]
\[ {\large{\text{RSS} \sim \chi^2(mn)}} \]
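A minimal sketch of the corresponding fit test, assuming the residuals have been scaled to unit noise variance so that the RSS is approximately \(\chi^2\)-distributed with \(mn\) degrees of freedom:

    import numpy as np
    from scipy.stats import chi2

    def rss_chi2(Y_pred, Y_meas, noise_var=1.0):
        """RSS between predicted and measured data, and its chi-square p-value."""
        resid = (Y_pred - Y_meas) / np.sqrt(noise_var)   # element-wise residuals, unit variance
        rss = np.sum(resid ** 2)                         # .^2 then sum, as in the formula
        dof = resid.size                                 # m * n
        p_value = chi2.sf(rss, df=dof)                   # P(chi2_mn >= RSS)
        return rss, p_value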

Results

[Results figure: SVG not rendered in this export]

When?

Errors in variables

Total Least Squares (TLS)

Remember:

\[ \boldsymbol{Y} - \boldsymbol{E} = -\boldsymbol{A}^{-1}(\boldsymbol{P} - \boldsymbol{F}) \]

which can be rearranged as

\[ \boldsymbol{A}(\boldsymbol{Y} - \boldsymbol{E}) = -(\boldsymbol{P} - \boldsymbol{F}) \]

Errors in variables

Total Least Squares (TLS)

\[ \begin{array}{l} \boldsymbol{A_{tls}} = \arg\min_{\boldsymbol{A}} || [(\boldsymbol{A}\boldsymbol{\hat{Y}} + \boldsymbol{\hat{P}})~ (\boldsymbol{\hat{Y}} + \boldsymbol{A}^{-1}\boldsymbol{\hat{P}})]||_{\ell_2}\\ \text{subject to:}~~ \boldsymbol{A}\boldsymbol{\hat{Y}} = -\boldsymbol{\hat{P}}\\ \end{array} \]

Markovsky and Van Huffel 2007

  • To incorporate TLS into the pipeline, we need a fast, structure-constrained TLS algorithm.
  • A LASSO combined with TLS would also be nice.

Not implemented yet!
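For reference only, a minimal sketch of the classical unstructured TLS solution to \(\boldsymbol{A}\boldsymbol{\hat{Y}} \approx -\boldsymbol{P}\) via one SVD of the stacked data; it ignores the structure constraints and sparsity the pipeline would actually need.

    import numpy as np

    def tls_A(Y_hat, P):
        """Classical (unstructured) total least squares estimate of A in A @ Y_hat ≈ -P.

        Transposed, this is the standard TLS problem C X ≈ D with
        C = Y_hat.T, D = -P.T and X = A.T, solved from the SVD of [C D].
        """
        C, D = Y_hat.T, -P.T
        n = C.shape[1]
        _, _, Vt = np.linalg.svd(np.hstack([C, D]))
        V = Vt.T
        V12, V22 = V[:n, n:], V[n:, n:]                  # blocks of the right singular vectors
        X = -np.linalg.solve(V22.T, V12.T).T             # X = -V12 @ inv(V22)
        return X.T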

Thanks