This repository has been archived by the owner on Sep 24, 2019. It is now read-only.

Abstract Syntax Tree for BEL Terms

Anthony Bargnesi edited this page Aug 31, 2015 · 9 revisions

Description

The parser recognizes several types of BEL expressions. The types currently recognized are:

  • value (e.g. AKT1)
  • namespaced value (e.g. HGNC:AKT1)
  • term (e.g. p(AKT1), p(HGNC:AKT1), tscript(p(HGNC:AKT1, pmod(S,Y,694))))

Each parsed token indicates whether it was deemed complete and carries a character position interval [start, end) (i.e. left-closed, right-open).
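
A half-open interval maps directly onto Ruby's exclusive range, so a token's text can be recovered by slicing the input string. The sketch below uses plain String slicing, not any bel.rb API, and assumes the positions [2, 11) for the namespaced value in the input p(HGNC:AKT1:

```ruby
# Half-open [start, end) intervals correspond to Ruby's exclusive range (start...end).
expression = 'p(HGNC:AKT1'

# Positions assumed for illustration: the namespaced value spans [2, 11).
nv_start = 2
nv_end = 11

namespaced_value = expression[nv_start...nv_end]
token_length = nv_end - nv_start # no off-by-one adjustment needed

puts namespaced_value # HGNC:AKT1
puts token_length     # 9
```

With this convention the length of a token is simply end minus start, and adjacent tokens can share a boundary without overlapping.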

For example, recognizing the unterminated term p(HGNC:AKT1 would produce the following tree:

TERM[0](-1, -1]
  fx(p)(0, 1]
  ARG[1]
    NV[1](2, 11]
      pfx(HGNC)(2, 6]
      val(AKT1)(7, 11]
    ARG[0]
      (null)
      (null)

In this example, notice that:

  • An incomplete TERM token was recognized with indeterminate character positions.
  • A complete argument (ARG) was recognized that happened to be a Namespace Value (NV) token. Its character position interval is printed as (2, 11], that is, characters 2 through 10 of the input.
  • An argument (ARG) exists with two NULL children. This terminal node marks the exclusive end of the argument list. Keeping it as the right child also preserves the structure of a binary tree (every parent node has exactly two children).
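
The argument list described above can be sketched as a right-leaning chain of ARG nodes. The Node struct and the arguments helper below are hypothetical names used only for illustration; bel.rb exposes the tree through BEL::LibBEL rather than through these types.

```ruby
# Hypothetical node type for illustration; not part of the bel.rb API.
Node = Struct.new(:label, :left, :right)

# Each ARG holds one argument on the left and the rest of the list on the right.
# An ARG whose children are both nil marks the end of the list.
def arguments(arg)
  found = []
  until arg.nil? || (arg.left.nil? && arg.right.nil?)
    found << arg.left
    arg = arg.right
  end
  found
end

# Tree for the incomplete input p(HGNC:AKT1, mirroring the example above.
nv   = Node.new('NV', Node.new('pfx(HGNC)'), Node.new('val(AKT1)'))
tail = Node.new('ARG') # terminal ARG: both children are nil
term = Node.new('TERM', Node.new('fx(p)'), Node.new('ARG', nv, tail))

args = arguments(term.right)
puts args.map(&:label).inspect # ["NV"]
```

Because every parent has exactly two children, the walk needs no special case for the number of arguments: it follows right children until it reaches the terminal ARG.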

Getting Started

Install bel.rb from the term_ast branch using:

git clone git@github.com:OpenBEL/bel.rb.git
cd bel.rb
git checkout term_ast
gem build bel.gemspec
gem install bel-0.3.3.gem

Open up an irb (or pry) session and try:

require 'bel'

# Returns an Abstract Syntax Tree that you can traverse.
BEL::Parser.parse('AKT1')
# => #<BEL::LibBEL::BelAst:0x000000035adea0>

# You can also print to a flattened string for debugging.
BEL::LibBEL.bel_print_ast(BEL::Parser.parse('AKT1'))
# => ARG[1] NV[1][0, 4] pfx((null)) val(AKT1)[0, 4] ARG[0] (null) (null)

# Structured as a tree it would be:
# ARG[1]
#   NV[1][0, 4]
#     pfx((null))
#     val(AKT1)[0, 4]
#   ARG[0]
#     (null)
#     (null)
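
When debugging, it can help to turn the flattened string back into the indented form. The following is a rough sketch that works on the printed string only, assuming the format shown above: TERM, ARG and NV tokens always have two children, and every other token is a leaf. The indent_ast helper is a hypothetical name, not a bel.rb function.

```ruby
# Node kinds that carry two children; everything else is treated as a leaf.
BRANCH = /\A(TERM|ARG|NV)/

def indent_ast(flat)
  # Split only on spaces that precede the start of a node label, so that
  # intervals such as "[0, 4]" stay attached to their token.
  tokens = flat.split(/ (?=TERM|ARG|NV|fx\(|pfx\(|val\(|\(null\))/)
  lines = []
  walk = lambda do |depth|
    token = tokens.shift
    return if token.nil?
    lines << ('  ' * depth) + token
    # Pre-order traversal: a branch is followed by its left, then right subtree.
    2.times { walk.call(depth + 1) } if token.match?(BRANCH)
  end
  walk.call(0)
  lines.join("\n")
end

flat = 'ARG[1] NV[1][0, 4] pfx((null)) val(AKT1)[0, 4] ARG[0] (null) (null)'
puts indent_ast(flat)
```

For the AKT1 string this prints the same seven-line tree as the comment above. A value containing a space followed by a node label (for example "x NV") would confuse the split, so treat this as a debugging aid only.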

Screen recording showing examples of expression parsing with BEL::Parser.

Design

API

Use Cases

Visual Examples

Simple protein term:

p(HGNC:AKT1)
           TERM                        
           +  +                        
           |  |                        
    .------+  +------.                 
    |                |                 
  fx(p)             ARG                
                    + +                
                    | |                
           .--------+ +----------.     
           |                     |     
          NV                    ARG    
          ++                    + +    
          ||                    | |    
    .-----++------.          .--+ +--. 
    |             |          |       | 
pfx(HGNC)     val(AKT1)     NULL    NULL

Modified protein term:

p(SFAM:"STAT5 Family",pmod(P,Y,694))
           TERM                        
           +  +                        
           |  |                        
    .------+  +------.                 
    |                |                 
  fx(p)             ARG                
                    + +                
                    | |                
           .--------+ +----------------.     
           |                           |     
          NV                          ARG    
          ++                          + +    
          ||                          | |    
    .-----++------.                .--+ +-----------. 
    |             |                |                | 
pfx(SFAM)     val(STAT5 Family)   TERM             ARG
                                  +  +             + +
                                  |  |             | |
                               .--+  +--.       .--+ +--.
                               |        |       |       |
                             fx(pmod)  ARG     NULL    NULL
                                       + +
                                       | |
                        .--------------+ +--.
                        |                   |
                       NV                  ARG
                       ++                  + +
                       ||                  | |
                    .--++--.            .--+ +---------.
                    |      |            |              |
                pfx(NULL)  P           NV             ARG
                                       ++             + +
                                       ||             | |
                                    .--++--.       .--+ +-------.
                                    |      |       |            |
                                pfx(NULL)  Y      NV           ARG
                                                  ++           + +
                                                  ||           | |
                                               .--++--.     .--+ +--.
                                               |      |     |       |
                                           pfx(NULL) 694   NULL    NULL
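
The diagrams above can also be read in the other direction: walking the tree reproduces the original expression. The sketch below builds the modified protein term by hand and serializes it back to BEL. AstNode, term, nv, arg_list and to_bel are hypothetical helpers for illustration, not the bel.rb API, and the quoting rule (quote any value that is not a plain word) is an assumption.

```ruby
# Hypothetical node type for illustration only; not the bel.rb API.
AstNode = Struct.new(:kind, :value, :left, :right)

def leaf(kind, value = nil)
  AstNode.new(kind, value)
end

# Build a right-leaning ARG chain that ends in a terminal ARG with nil children.
def arg_list(items)
  items.reverse.reduce(leaf(:arg)) { |rest, item| AstNode.new(:arg, nil, item, rest) }
end

def nv(value, prefix = nil)
  AstNode.new(:nv, nil, leaf(:pfx, prefix), leaf(:val, value))
end

def term(function, *items)
  AstNode.new(:term, nil, leaf(:fx, function), arg_list(items))
end

def to_bel(node)
  case node.kind
  when :term
    args = []
    arg = node.right
    while arg && arg.left # the terminal ARG has no left child
      args << to_bel(arg.left)
      arg = arg.right
    end
    "#{node.left.value}(#{args.join(',')})"
  when :nv
    value = node.right.value
    value = %("#{value}") unless value.match?(/\A\w+\z/) # assumed quoting rule
    prefix = node.left.value
    prefix ? "#{prefix}:#{value}" : value
  end
end

tree = term('p', nv('STAT5 Family', 'SFAM'), term('pmod', nv('P'), nv('Y'), nv('694')))
puts to_bel(tree) # p(SFAM:"STAT5 Family",pmod(P,Y,694))
```

A nested term such as pmod(...) appears as the left child of an ARG, exactly like a namespaced value, so the same recursive call handles both.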