Web server

yaowen edited this page Dec 16, 2021 · 6 revisions

MetaLogo WebServer (v1.1.0)

MetaLogo provides a public web server for users. Users can also deploy their own MetaLogo server on their local network. This tutorial explains how to use the MetaLogo web server without coding.

First, visit http://metalogo.omicsnet.org (or http://localhost:8050 if you deployed MetaLogo on your own computer). If nothing goes wrong, you will see the following page.

server_top

As the top menu indicates, you can reach the analysis tab, the results tab, the tutorial, the Python package, the journal paper and our lab website from there, or email us.

To start an analysis, click one of the Analysis buttons on the page.

Analysis page

On the analysis page, there are four panels for users to customize their analysis. For basic use of MetaLogo, you probably only need to focus on the first panel. More advanced usage requires adjusting the parameters in the last three panels. This tutorial introduces the parameters and usage of all four panels. Please note that the four Submit buttons on the page all do the same thing. Feel free to click any of them after you finish your settings.

Step1. Input Panel

input_panel

The figure above shows the details of the first panel. Red numbers mark the functional items of the panel, which are explained below.

  1. Input Format

    For Input Format, you could choose from Fasta and Fastq formats.

  2. Sequence Type

    For Sequence Type, you could choose from Auto, DNA, RNA or protein.

    Auto means MetaLogo will detect the sequence type automatically. Note that if you choose a specific sequence type instead, you need to make sure that your input sequences contain only characters that are valid for that type.

  3. Basic analysis

    For Basic analysis, you could choose from Yes and No.

    MetaLogo can perform basic analysis on your input data, such as length distribution, group counts, entropy distribution, group clustering heatmap, sequence pairwise distance distribution, and so on. If you choose No for Basic analysis, MetaLogo will skip this analysis and only output the sequence logo.

  4. Grouping By

    For Grouping By, you can choose from Auto, Length, and Seq Identifier.

    This parameter controls how MetaLogo groups your sequences to reveal their heterogeneity. If you choose Auto, MetaLogo will perform multiple sequence alignment on your sequences, build a phylogenetic tree, and group the sequences based on the tree. If you choose Length, MetaLogo will group the input sequences according to their lengths. If you choose Seq Identifier, MetaLogo will read the group of each sequence from its sequence name by looking for a pattern like group@number-name. Below is a valid example:

     >seq1 group@1-firstgroup
     ATACAGACAGAGACACAGGGTTCG
     >seq2 group@1-firstgroup
     ATACAGACAGAGACACAGGGGTTC
     >seq3 group@2-secondgroup
     ATACAGACAGAGACACAG
     >seq4 group@2-secondgroup
     ATACCCCCAGAGACACAG
    

    Please note that in Seq Identifier grouping mode, sequence lengths should be the same in a group.

  5. Clustering Method

    For Clustering Method, you can choose from Max, Max_clade and Single_linkage.

    This parameter determines how TreeCluster clusters your sequences based on the tree. For more details, please check https://github.com/niemasd/TreeCluster.

  6. Grouping Resolution

    For Grouping Resolution, you can set any value between 0 and 1.

    Since the clustering process is based on the distances among sequences on the phylogenetic tree, TreeCluster needs a distance threshold. We provide a resolution value to control this threshold: the bigger the resolution, the bigger the threshold. Resolution is limited to [0,1]. Resolution 0 means MetaLogo will treat each sequence as a single group, while resolution 1 means MetaLogo will treat all the sequences as one group. Simply put, if you want more groups, set a lower resolution; if you want fewer groups, set a higher resolution. If a sequence is not assigned to any cluster, or the sequence is a group by itself, MetaLogo will consider it a singleton and assign it to the -1 group. That means resolution 0 will also lead to a single-group sequence logo, since all the sequences are assigned to the -1 group.

  7. Minimum Length and Maximum Length

    These two parameters specify the range of sequence lengths to include when making the sequence logo.

  8. Please refer to 7.

  9. Display Range (left) and Display Range (right)

    These two parameters specify the region of the sequence logos to display. The range starts at 0 and ends at -1, where -1 means the end of the sequence, in Python style. Note that these parameters have no effect if the sequence logos are not aligned.

  10. Please refer to 9.

  11. Group Limit

    This parameter specifies how many groups you want to keep on the final sequence logo figure, in case the clustering process produces so many groups that they cannot be presented clearly on the final figure. MetaLogo keeps groups according to the number of sequences in them. MetaLogo considers the first sequence of the input as the target sequence and tracks its grouping, so MetaLogo will always keep the group to which the target sequence is assigned, no matter what value you set for Group Limit.

  12. Examples

    By clicking one of the three example buttons, you can load an example dataset into the input area. Then just click the Submit button to perform the analysis. The first set contains sequences of E. coli transcription factor binding sites, the second set contains CDR3 sequences of verified antibodies detected in BCR repertoires of individuals with COVID-19, and the third set contains sequences of experimentally validated BCR clonotypes from another study of patients with COVID-19.

  13. Input area

    Please paste your sequences here. Please note the first sequence of your input will be treated as the target sequence. MetaLogo will track the grouping position of the target sequence.

  14. Upload file

    You can also upload a file instead of pasting it into the text area. Please note that a file larger than 5MB will not be uploaded.
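The length filter, the Seq Identifier grouping and the Group Limit rule described above can be sketched in a few lines of Python. This is only an illustration of the rules as the tutorial states them, not MetaLogo's actual code: the function names are hypothetical, the length bounds are assumed to be inclusive, untagged sequences are simply skipped, and the target's group is assumed to be added on top of the limit rather than replacing another group.

```python
import re
from collections import defaultdict

# Pattern for the "group@number-name" convention shown in the example above.
GROUP_TAG = re.compile(r"group@(\d+)-(\S+)")


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def group_by_identifier(records, min_len, max_len):
    """Apply the length window, then group records by the number in their tag."""
    groups = defaultdict(list)
    for header, seq in records:
        if not (min_len <= len(seq) <= max_len):  # inclusive bounds: an assumption
            continue
        match = GROUP_TAG.search(header)
        if match is None:
            continue  # untagged sequences are skipped in this sketch
        groups[match.group(1)].append((header, seq))
    return dict(groups)


def apply_group_limit(groups, limit, target_header):
    """Keep the `limit` largest groups, plus the group holding the target."""
    ranked = sorted(groups, key=lambda gid: len(groups[gid]), reverse=True)
    kept = ranked[:limit]
    for gid in ranked[limit:]:
        if any(header == target_header for header, _ in groups[gid]):
            kept.append(gid)  # the target's group survives regardless of the limit
    return {gid: groups[gid] for gid in kept}


example = """>seq1 group@1-firstgroup
ATACAGACAGAGACACAGGGTTCG
>seq2 group@1-firstgroup
ATACAGACAGAGACACAGGGGTTC
>seq3 group@2-secondgroup
ATACAGACAGAGACACAG
>seq4 group@2-secondgroup
ATACCCCCAGAGACACAG
"""

records = parse_fasta(example)
groups = group_by_identifier(records, min_len=10, max_len=30)
target_header = records[0][0]  # the first input sequence is the target
limited = apply_group_limit(groups, limit=1, target_header=target_header)
print({gid: len(members) for gid, members in limited.items()})
```

Running a check like this on your own input before submitting can catch headers that do not match the pattern or sequences that fall outside the length window.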

Step2. Logo alignment Algorithm Panel

algorithm_panel

In this panel, users can specify the algorithms used to build the logos and to align them. Before setting the parameters in this section, users should read the Alignment tutorial first.

  1. Height Algorithm

    For Height Algorithm, users can choose from Bits, Bits without correction and Probabilities.

    Bits is used in most sequence logo generators: the total height of the letters at a position depicts the information content of that position. Please check the details at https://en.wikipedia.org/wiki/Sequence_logo. If you choose Probabilities, the height of each letter represents the proportion of that letter among all letters at that position. Note that if you choose Bits, a small-sample correction will be performed. If a group contains too few sequences, the group will get an all-zero bits array. Choose Bits without correction or Probabilities to avoid this if you only have a few sequences as input.

  2. Adjacent Alignment

    If you choose Yes for Adjacent Alignment, MetaLogo will try to link conserved positions between adjacent groups on the final figure.

  3. Global Alignment with padding

    This parameter has no effect in the auto-grouping scenario. When users specify the grouping strategy themselves, MetaLogo can align sequence logos from different groups with an adjusted MSA algorithm. If you set Global Alignment with padding to Yes, padding will be inserted to align the different sequence logos. Note that in the auto-grouping scenario, all sequences have already been aligned in the MSA process before the sequence logos are drawn.

  4. Score Metric

    For Score Metric, users can choose from Dot Production, Jensen Shannon, Cosine and Entropy weighted Bhattacharyya Coefficient. For explanations, please check the Alignment tutorial.

  5. Gap Penalty

    This parameter only applies to sequence logo alignment in the user-defined grouping scenario. Gap Penalty sets the gap penalty used in the logo alignment process. If you set a small penalty, like 0, MetaLogo will insert as many gaps as possible. In contrast, if you set a large penalty, like -10, MetaLogo will avoid gaps as much as possible. Below is an explanatory figure.

    gap

  6. Connect Threshold

    For Connect Threshold, the default value is -0.2. This threshold does not affect the alignment process, but it guides MetaLogo in choosing which pairs of aligned positions from two adjacent logos to connect. If the threshold is positive (>0), MetaLogo will connect positions with a similarity larger than the threshold. If the threshold is negative (<0), MetaLogo will connect the top (ratio*100)% of all pairs of aligned positions between two adjacent logos, ranked by similarity, where ratio equals -1*threshold. Below is an illustrated example.

    connect_threshold
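The Connect Threshold rule from item 6 can be written out as a small Python function. This is a minimal sketch of the rule as stated in this tutorial, not MetaLogo's own code: the function name and the dictionary input are hypothetical, rounding the number of kept pairs down is an assumption, and the behaviour at exactly 0 is not described in the tutorial, so the sketch returns no connections in that case.

```python
def select_connections(similarities, threshold=-0.2):
    """Choose which aligned position pairs to connect between two adjacent logos.

    `similarities` maps an aligned position pair to its similarity score.
    """
    if threshold > 0:
        # Positive threshold: keep every pair whose similarity exceeds it.
        return [pair for pair, score in similarities.items() if score > threshold]
    if threshold < 0:
        # Negative threshold: keep the top (ratio * 100)% of pairs by similarity.
        ratio = -1 * threshold
        ranked = sorted(similarities, key=similarities.get, reverse=True)
        n_keep = int(len(ranked) * ratio)  # rounding down is an assumption
        return ranked[:n_keep]
    return []  # threshold == 0 is not covered by the tutorial (assumption)


# Hypothetical similarity scores for four aligned position pairs.
scores = {(0, 0): 0.9, (1, 1): 0.5, (2, 2): 0.1, (3, 3): 0.8}

by_value = select_connections(scores, threshold=0.6)   # similarity > 0.6
by_rank = select_connections(scores, threshold=-0.5)   # top 50% of pairs
print(by_value, by_rank)
```

A negative threshold is useful when similarity values vary a lot between datasets, because it fixes the fraction of connected positions rather than an absolute cutoff.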

Step3. Layout Panel

layout_panel

In this panel, users can choose different layouts for their logo groups.

  1. Logo Shape

    There are in total four different layouts, including Horizontal, Circular, Radial and 3D. Below is a collection:

    layouts

    Please note that the horizontal shape should be chosen most of the time, because only logos with a horizontal shape can be connected to a clustering tree.

  2. Connect Tree

  3. Sort By

    For Sort By, users can choose from Auto, Length, Length_reverse, Group Id, Group ID reverse.

    This parameter specifies the order of the logo groups. In the auto-grouping scenario, Auto is chosen automatically. Length means MetaLogo will sort logos by sequence length; Length_reverse sorts by sequence length in descending order; Group Id sorts logos by the group names given in the sequence names; Group Id reverse sorts by group name in descending order.

  4. Logo Margin Ratio

    Logo Margin Ratio, Column Margin Ratio and Character Margin Ratio specify the distances between items. All of them are relative distances: if you set Logo Margin Ratio to 0.1, you will get a 1cm distance between logos when the logo height is 10cm. Below is an illustrated example.

    margin_explained

    margin_example

Step4. Output Style Panel

In this panel, you could specify lots of parameters to customize your MetaLogo sequence logos. Most of them are easy to understand.

style_panel

  1. Auto figure size

    If you set this parameter to Yes, MetaLogo will automatically determine the figure size of the output. However, you can still set your own size.

  2. Alignment Color

    For Alignment Color, users could choose the color for MetaLogo to connect or highlight similarly aligned positions.

  3. Alignment Transparency

    This parameter specifies the transparency of these connections.

  4. Color scheme

    For Color Scheme, users can choose a built-in scheme such as DNA Basic, RNA Basic or Protein Basic. Users can also build their own color scheme by assigning a color to each base through our color picker widgets. The color picker supports RGB, HSL and HEX.

    color_picker

    Note that all color settings are stored in the local storage of your web browser, so you can refresh the MetaLogo page without worrying about losing your carefully chosen color schemes.

  5. Download Format

    For Download Format, MetaLogo provides PNG, PDF, PS, EPS and SVG formats for users to download.

Result Page

After you submit the job, you will be redirected to a new result page, shown below.

Waiting message

result_page

The MetaLogo server assigns each task a unique uid. When you click the Submit button, you will be redirected to the result page. The page refreshes itself regularly to get the latest message from the server.

  1. Uid

    This is the uid of the current task.

  2. Link

    This is the unique link of the current task. Users can save this link and check it again within the following 7 days.

  3. Refresh time

    The page refreshes every 10 seconds during the first minute, then every minute, and then every 10 minutes. How long you wait depends on the computational load of your task.

Results content

After the task is completed, the MetaLogo server will present all the results on the result page. There are several panels that display different types of results, which are explained below.

Task info and parameters

This panel will show the basic information of the current task, including the id, title, created time and other detailed parameters.

task_info

Sequence Logos

This panel shows the sequence logos as well as a re-run module.

logo_panel

  1. Resolution value

    This is the resolution value of the current figure. Note that if the grouping is not determined automatically by MetaLogo, no message will be displayed at the top of the figure.

  2. Group limit

    This value shows how many groups are displayed on the plot out of the total number of groups. Please refer to the analysis page. Note that if a group contains only a few sequences, the group could get an all-zero bits array, and MetaLogo will not show that group on the figure. If you choose probabilities or bits_without_correction as the height algorithm, this will not be a problem.

  3. Display range

    This shows the displayed region of the alignment.

  4. Group Id

    These values indicate the group id of each group. These ids are generated by TreeCluster. The group -1 contains all the singleton sequences.

  5. Bits

    Each logo has its own bits axis, running from 0 to the highest bits in the group. Note that a tick can be the top of one logo and also the bottom of another, so a label may look like 4.57/0, in which 4.57 is the highest bits of the lower logo and 0 is the bottom of the upper logo.

  6. Conserved motifs

    Conserved motifs are connected by colored strands. These strands can be controlled by choosing an algorithm and a threshold; please refer to the instructions for the analysis page.

  7. Gaps

    These blanks are gaps in the multiple sequence alignment or multiple logo alignment.

  8. Target sequence mark

    The red dot on the figure indicates the group to which the target belongs. Remember that MetaLogo takes the first sequence of the input as the target sequence to track.

  9. Hierarchical clustering

    The tree on the left shows the hierarchical clustering result of the groups. For each group, the bits representation vector or the probabilities vector is used for clustering. Note that this tree is not the phylogenetic tree. The phylogenetic tree of the sequences is also provided on the result page and will be explained later.
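The note under Group limit, that a group with few sequences can end up with an all-zero bits array, follows from the small-sample correction. The sketch below uses the standard formulation given on the Wikipedia sequence logo page, with the correction term e_n = (s - 1) / (2 ln(2) n) for alphabet size s and n sequences. Whether MetaLogo uses exactly this formula is an assumption, as is clipping negative information content to zero; the function name is hypothetical.

```python
import math
from collections import Counter


def column_heights(column, alphabet_size=4, correction=True):
    """Return per-letter heights in bits for one alignment column."""
    n = len(column)
    freqs = {letter: count / n for letter, count in Counter(column).items()}
    # Shannon entropy of the column in bits.
    entropy = sum(-p * math.log2(p) for p in freqs.values())
    # Small-sample correction; it grows as the number of sequences shrinks.
    e_n = (alphabet_size - 1) / (2 * math.log(2) * n) if correction else 0.0
    # Information content of the position; clipping at zero is an assumption.
    info = max(math.log2(alphabet_size) - (entropy + e_n), 0.0)
    return {letter: p * info for letter, p in freqs.items()}


# A DNA group with only two sequences that disagree at this position.
column = "AC"
corrected = column_heights(column, correction=True)
uncorrected = column_heights(column, correction=False)
print(corrected, uncorrected)
```

With only two sequences the correction term is larger than the remaining information, so every height becomes zero, while the uncorrected variant still gives each letter a visible height.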

Statistics Analysis

If you choose to perform basic analysis on the analysis page, MetaLogo will provide you several statistical figures on the result page.

Figure 0. Sequence lengths distribution.

figure0

Figure 1. Sequence counts of each group.

figure1

Figure 2. Entropies of each position. ("X"s mean gaps)

figure2

Each box represents the entropy of the corresponding position. The higher the entropy, the less conserved the position.
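As a rough illustration of what each box holds, the per-position entropy of a set of equal-length (aligned) sequences can be computed as below. This is a minimal sketch, not MetaLogo's code: the function name is hypothetical, the entropy is assumed to be measured in bits, and every character in a column, including any gap symbol, is counted as an ordinary symbol.

```python
import math
from collections import Counter


def position_entropies(aligned_seqs):
    """Shannon entropy (in bits) of every column of equal-length sequences."""
    entropies = []
    for column in zip(*aligned_seqs):
        total = len(column)
        counts = Counter(column)
        entropies.append(
            sum(-(c / total) * math.log2(c / total) for c in counts.values())
        )
    return entropies


# Hypothetical group: positions 1, 2 and 4 are conserved, position 3 is not.
seqs = ["ATAC", "ATCC", "ATGC", "ATTC"]
entropies = position_entropies(seqs)
print(entropies)
```

Fully conserved columns have an entropy of 0, and a DNA column in which all four bases are equally frequent reaches the maximum of 2 bits.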

Figure 3. Entropies distribution of each group.

figure3

Figure 4. Correlations among groups. (Only for global alignment or auto-grouping mode, and only when there is more than one group)

figure4

The dendrogram on the figure is the same as that on the left of the sequence logos.

Figure 5. Distribution of pairwise distances of nodes in the phylogenetic tree. (Only for auto-grouping mode)

figure5

This figure can be used to guide the selection of the clustering resolution.

Other Analysis

other_results

The MetaLogo server also provides visualizations of the MSA and the phylogenetic tree. Note that these are based on all the sequences rather than on the groups.

msa

tree


Download files

MetaLogo provides the result files and the intermediate results for download.

download

  1. Config File

    This file contains all the parameters MetaLogo needs to replicate the current task. You can save the config for later use. The file can also be used together with the MetaLogo Python package.

  2. Sequence Input

    This is the same file that the user uploaded to the server as input.

  3. Sequence Logo

    This is the main sequence logo figure. Users can choose the preferred format (PNG, PDF, SVG, etc.) on the analysis page and download it here.

  4. MSA result

    This file contains the MSA result in FASTA format.

  5. Phylogenetic Tree

    This file contains the phylogenetic tree of the sequences in Newick format.

  6. Grouping details

    This file contains the grouping information of all the sequences. Note that only sequences that satisfy the parameter limits set on the analysis page are kept here.