Protein Chemical Shift Ranges from PIQC

We have performed a statistical analysis of the chemical shifts in the PACSY database,¹ which contains >3000 proteins with 3D structures. After removing misreferenced and misassigned data, we determined refined (multidimensional) chemical shift ranges for intra-residue correlations (¹³C–¹³C, ¹⁵N–¹³C, etc.). These chemical shift ranges can be used to obtain amino-acid type assignments and/or secondary-structure information from experimental NMR spectra.²

This page provides access to the Python tools we built for analyzing the PACSY database, as well as some of the more useful derived data in tabulated form. Based on the purged data, we also provide two command-line programs, PLUQin and SQAT. PLUQin helps with amino-acid type assignment of experimental data. SQAT enables a quick assessment of the quality of assigned chemical shift data when secondary-structure information is available.

References

Lee, W.; Yu, W.; Kim, S.; Chang, I.; Lee, W. "PACSY, a Relational Database Management System for Protein Structure and Chemical Shift Analysis." J. Biomol. NMR 2012, 54 (2), 169–179.
Fritzsching, K. J.; Hong, M.; Schmidt-Rohr, K. "Conformationally Selective Multidimensional Chemical Shift Ranges in Proteins from a PACSY Database Purged Using Intrinsic Quality Criteria." J. Biomol. NMR 2016. doi:10.1007/s10858-016-0013-5

Download

PLUQ-PLUQin-SQAT-PIQC Python 2.7 code. Instructions for use are below.

CSV formatted data tables with all chemical shift statistics from PIQC.

Install

The code has dependencies outside the Python standard library. On a Mac or Linux system with Python 2.x installed, install with:


            cd pluq
            pip install -r requirements.txt
            python setup.py install

It is a bad idea to use your default system Python. Please see `Pluqin_Install_Directions.txt` for an explanation of how to properly install Python.
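
As a quick check before installing, the sketch below reports which interpreter is active and whether it looks like an isolated environment. It is not part of the PLUQ distribution, and the helper name `in_virtual_env` is made up for illustration. It only detects environments created with virtualenv or the venv module; other setups (for example conda environments) are not recognized by this test.

```python
import sys


def in_virtual_env():
    """Return True if the running interpreter looks like a virtualenv/venv."""
    # virtualenv on Python 2 sets sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # The venv module (Python 3) makes sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("interpreter: %s" % sys.executable)
print("Python version: %d.%d" % sys.version_info[:2])
print("isolated environment: %s" % in_virtual_env())
```

If the last line reports False, the `pip install` above would most likely modify the interpreter shown on the first line.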

For Programmers

The functionality of the code and the provided scripts will be maintained and extended, but the API may change as needed. If you find bugs or would like to offer improvements, please contact Keith (kfritzsc@brandeis.edu).

PLUQin

A program to help assign protein chemical shift peaks. It is especially helpful for assigning 2D ¹³C–¹³C chemical shift correlations and also provides secondary-structure information.

Examples:

Use -p for each peak you want to add. For a 1D experiment, give one number after -p; for a 2D experiment, give two. You can enter as many peaks as you like. Set the experiment type with the option -e; the default is c (1D carbon). Set the joint-probability cut-off with the option -c. A negative cut-off shows all options before the joint probability is taken.


            $ pluqin.py -p 55.2 -p 18 -c 0
            input: [55.2], [18.0]
            experiment: c
            AA  p1    p2    p1    p2    Joint  H     C     E
            A   CA    CB    25.8  44.8  91.4   99.2  0.8   -
            M   CA    CE    3.3   8.0   8.6    7.7   42.4  49.9

Peak positions from a 2D C-C experiment can be entered like:


            $ pluqin.py -p 55.2 18 -c 0 -e cc
            input: [55.2, 18.0]
            experiment: cc
            AA  p1            p1     Joint  H     C    E
            A   ('CA', 'CB')  100.0  100.0  92.9  6.9  0.1

Sequence information can be given with the option -s.


            $ pluqin.py -p 55.2 -p 18 -s MLFAMM -c 0 -e c
            input: [55.2], [18.0]
            experiment: c
            AA  p1  p2    p1    p2    Joint  H     C     E
            M   CA  CE    48.0  68.8  53.8   7.7   42.4  49.9
            A   CA  CB    30.3  31.2  46.2   99.2  0.8   -
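
When many peak lists have to be checked, it can help to assemble the command line in a script. The following is a minimal sketch that uses only the options described above (-p, -e, -c, -s). The function name `build_pluqin_args` and the dimension table are assumptions made for illustration and are not part of PLUQin; the table covers only the experiment codes shown on this page (c, cc, cn).

```python
# Number of shifts expected per peak, for the experiment codes shown on
# this page only. Other codes are passed through without checking.
KNOWN_DIMENSIONS = {"c": 1, "cc": 2, "cn": 2}


def build_pluqin_args(peaks, experiment="c", cutoff=None, sequence=None,
                      script="pluqin.py"):
    """Return the argument list for one pluqin.py call.

    peaks is a list of tuples, one tuple per peak, e.g.
    [(55.2,), (18,)] for two 1D peaks or [(55.2, 18)] for one 2D peak.
    """
    expected = KNOWN_DIMENSIONS.get(experiment)
    args = [script]
    for peak in peaks:
        if expected is not None and len(peak) != expected:
            raise ValueError("experiment %r expects %d value(s) per peak, "
                             "got %r" % (experiment, expected, peak))
        args.append("-p")
        args.extend(str(shift) for shift in peak)
    if cutoff is not None:
        args.extend(["-c", str(cutoff)])
    if sequence:
        args.extend(["-s", sequence])
    args.extend(["-e", experiment])
    return args


# Reproduces the arguments of the sequence example above.
print(" ".join(build_pluqin_args([(55.2,), (18,)], cutoff=0,
                                 sequence="MLFAMM")))
```

The returned list can be passed to `subprocess.check_output` to run the program and capture its output, provided pluqin.py is on the PATH.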

Peaks from a 2D C-N experiment can be entered like:


            # remember only intra-residue peaks will work.
            $ pluqin.py -p 45 103 -e cn
            input: [45.0, 103.0]
            experiment: cn
            AA  p1           p1     Joint  H     C     E
            G   ('CA', 'N')  100.0  100.0  14.6  65.3  20.1

Sometimes PLUQin cannot make a definitive type assignment but can still provide secondary-structure information. For example, every candidate here is almost certainly sheet (E):


            $ pluqin.py -p 175 55 -p 55 35 -e cc
            input: [175.0, 55.0], [55.0, 35.0]
            experiment: cc
            AA  p1           p2            p1    p2    Joint  H     C     E
            K   ('C', 'CA')  ('CA', 'CB')  19.0  58.6  66.3   -     3.0   97.0
            R   ('C', 'CA')  ('CA', 'CB')  11.4  13.8  16.0   -     1.6   98.4
            E   ('C', 'CA')  ('CA', 'CB')  14.1  12.3  12.8   -     0.8   99.2
            H   ('C', 'CA')  ('CA', 'CB')  7.4   4.4   4.9    -     3.6   96.4
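
To use such a table in a script, the printed output can be parsed back into records. The sketch below is not part of PLUQin. It assumes that columns are separated by two or more spaces, as in the listings on this page, that the last four columns are Joint, H, C and E, and that "-" stands for a missing value. The function names are made up for illustration, and the actual program output may differ.

```python
import re

EXAMPLE_OUTPUT = """\
AA  p1           p2            p1    p2    Joint  H     C     E
K   ('C', 'CA')  ('CA', 'CB')  19.0  58.6  66.3   -     3.0   97.0
R   ('C', 'CA')  ('CA', 'CB')  11.4  13.8  16.0   -     1.6   98.4
E   ('C', 'CA')  ('CA', 'CB')  14.1  12.3  12.8   -     0.8   99.2
H   ('C', 'CA')  ('CA', 'CB')  7.4   4.4   4.9    -     3.6   96.4
"""


def _to_float(field):
    # "-" is treated as "no value" (assumption about the output format).
    return None if field == "-" else float(field)


def parse_pluqin_table(text):
    """Parse PLUQin-style table text into a list of dicts, one per row."""
    rows = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        # Data rows start with a one-letter amino-acid code and hold
        # 1 + 2n + 4 fields (n assignments, n probabilities, Joint, H, C, E).
        if (len(fields) < 7 or len(fields) % 2 == 0
                or len(fields[0]) != 1 or not fields[0].isupper()):
            continue
        n = (len(fields) - 5) // 2
        joint, h, c, e = [_to_float(f) for f in fields[-4:]]
        rows.append({
            "aa": fields[0],
            "assignments": fields[1:1 + n],
            "probabilities": [_to_float(f) for f in fields[1 + n:1 + 2 * n]],
            "joint": joint, "H": h, "C": c, "E": e,
        })
    return rows


def dominant_structure(row):
    """Return 'H', 'C' or 'E', whichever has the largest percentage."""
    return max(("H", "C", "E"), key=lambda key: row[key] or 0.0)


for row in parse_pluqin_table(EXAMPLE_OUTPUT):
    print("%s %s" % (row["aa"], dominant_structure(row)))
```

For the listing above, all four candidates (K, R, E, H) come out as E, which matches the reading that the residue is in a sheet even though its type is ambiguous.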

For a full list of options use: pluqin.py -h.

PIQC

Purging by Intrinsic Quality Criteria: used to identify misreferenced and otherwise compromised protein chemical shift data sets in the PACSY database. The results of running PIQC can be downloaded above. The maintainers of the PACSY database will also run the analysis monthly and provide the output within the PACSY database. Nevertheless, the programs are included in the scripts/build_pacsy directory; please follow the directions in the readme.txt file. A graphical view of PIQC's output for proteins with ¹³C chemical shifts is below:

SQAT

Coming Soon (by Feb. 12)