Examples

Start the python interpreter and load the module

import pynmrstar

Load an entry directly from the BMRB database

>>> ent15000 = pynmrstar.Entry.from_database(15000)

View the hierarchy of saveframes and loops as a tree

>>> ent15000.print_tree()
<pynmrstar.Entry '15000' from_database(15000)>
    [0] <pynmrstar.Saveframe 'entry_information'>
        [0] <pynmrstar.Loop '_Entry_author'>
        [1] <pynmrstar.Loop '_SG_project'>
        [2] <pynmrstar.Loop '_Struct_keywords'>
        [3] <pynmrstar.Loop '_Data_set'>
        [4] <pynmrstar.Loop '_Datum'>
        [5] <pynmrstar.Loop '_Release'>
        [6] <pynmrstar.Loop '_Related_entries'>
    [1] <pynmrstar.Saveframe 'citation_1'>
        [0] <pynmrstar.Loop '_Citation_author'>
    [2] <pynmrstar.Saveframe 'assembly'>
        [0] <pynmrstar.Loop '_Entity_assembly'>
    [3] <pynmrstar.Saveframe 'F5-Phe-cVHP'>
        [0] <pynmrstar.Loop '_Entity_db_link'>
        [1] <pynmrstar.Loop '_Entity_comp_index'>
        [2] <pynmrstar.Loop '_Entity_poly_seq'>
    [4] <pynmrstar.Saveframe 'natural_source'>
        [0] <pynmrstar.Loop '_Entity_natural_src'>
    [5] <pynmrstar.Saveframe 'experimental_source'>
        [0] <pynmrstar.Loop '_Entity_experimental_src'>
    [6] <pynmrstar.Saveframe 'chem_comp_PHF'>
        [0] <pynmrstar.Loop '_Chem_comp_descriptor'>
        [1] <pynmrstar.Loop '_Chem_comp_atom'>
        [2] <pynmrstar.Loop '_Chem_comp_bond'>
    [7] <pynmrstar.Saveframe 'unlabeled_sample'>
        [0] <pynmrstar.Loop '_Sample_component'>
    [8] <pynmrstar.Saveframe 'selectively_labeled_sample'>
        [0] <pynmrstar.Loop '_Sample_component'>
    [9] <pynmrstar.Saveframe 'sample_conditions'>
        [0] <pynmrstar.Loop '_Sample_condition_variable'>
    [10] <pynmrstar.Saveframe 'NMRPipe'>
        [0] <pynmrstar.Loop '_Vendor'>
        [1] <pynmrstar.Loop '_Task'>
    [11] <pynmrstar.Saveframe 'PIPP'>
        [0] <pynmrstar.Loop '_Vendor'>
        [1] <pynmrstar.Loop '_Task'>
    [12] <pynmrstar.Saveframe 'SPARKY'>
        [0] <pynmrstar.Loop '_Vendor'>
        [1] <pynmrstar.Loop '_Task'>
    [13] <pynmrstar.Saveframe 'CYANA'>
        [0] <pynmrstar.Loop '_Vendor'>
        [1] <pynmrstar.Loop '_Task'>
    [14] <pynmrstar.Saveframe 'X-PLOR_NIH'>
        [0] <pynmrstar.Loop '_Vendor'>
        [1] <pynmrstar.Loop '_Task'>
    [15] <pynmrstar.Saveframe 'spectrometer_1'>
    [16] <pynmrstar.Saveframe 'spectrometer_2'>
    [17] <pynmrstar.Saveframe 'spectrometer_3'>
    [18] <pynmrstar.Saveframe 'spectrometer_4'>
    [19] <pynmrstar.Saveframe 'spectrometer_5'>
    [20] <pynmrstar.Saveframe 'spectrometer_6'>
    [21] <pynmrstar.Saveframe 'NMR_spectrometer_list'>
        [0] <pynmrstar.Loop '_NMR_spectrometer_view'>
    [22] <pynmrstar.Saveframe 'experiment_list'>
        [0] <pynmrstar.Loop '_Experiment'>
    [23] <pynmrstar.Saveframe 'chemical_shift_reference_1'>
        [0] <pynmrstar.Loop '_Chem_shift_ref'>
    [24] <pynmrstar.Saveframe 'assigned_chem_shift_list_1'>
        [0] <pynmrstar.Loop '_Chem_shift_experiment'>
        [1] <pynmrstar.Loop '_Atom_chem_shift'>

There is a shorthand way to access saveframes by name (look at the tree for the saveframe names):

>>> ent15000['entry_information']
<pynmrstar.Saveframe 'entry_information'>

And a shorthand way to access loops by category (only one loop of a given category can exist in a saveframe):

>>> ent15000['entry_information']['_Entry_author']
<pynmrstar.Loop '_Entry_author'>

Next, we’ll load the same entry again from the database, remove a saveframe, and compare it to the original.

>>> ent15000_2 = pynmrstar.Entry.from_database(15000)
>>> del ent15000_2['entry_information']
>>> pynmrstar.diff(ent15000, ent15000_2)
The number of saveframes in the entries are not equal: '25' vs '24'.
No saveframe with name 'entry_information' in other entry.

Let’s look at a loop’s tags and its data:

>>> ent15000['entry_information']['_Entry_author'].get_tag_names()
['_Entry_author.Ordinal',
 '_Entry_author.Given_name',
 '_Entry_author.Family_name',
 '_Entry_author.First_initial',
 '_Entry_author.Middle_initials',
 '_Entry_author.Family_title',
 '_Entry_author.Entry_ID']

Get the first names of the authors using direct saveframe name and loop reference:

>>> ent15000['entry_information']['_Entry_author'].get_data_by_tag('Given_name')
[['Claudia', 'Gabriel', 'Erik', 'Samuel', 'John']]

Get the first and last names of the authors by providing multiple tags to get_data_by_tag.

>>> ent15000['entry_information']['_Entry_author'].get_data_by_tag(['Given_name', 'Family_name'])
[['Claudia', 'Cornilescu'],
 ['Gabriel', 'Cornilescu'],
 ['Erik', 'Hadley'],
 ['Samuel', 'Gellman'],
 ['John', 'Markley']]

Write the modified entry to disk in NMR-STAR format:

>>> ent15000_2.write_to_file("example_entry.str")

Get a list of validation errors. (The line numbers are only available if an entry is loaded from a file. When an entry is loaded from the API the line numbers are not preserved.)

>>> ent15000.validate()
["Value cannot be NULL but is: '_Chem_comp.Provenance':'.' on line 'None'."]

Here is how to create a loop from scratch

>>> new_loop = pynmrstar.Loop.from_scratch()
>>> new_loop.add_tag("_Example.age")
>>> new_loop.add_tag("name")
>>> new_loop.add_tag("description")
>>> new_loop.add_tag("notes")

Alternatively, you could replace above with:

>>> new_loop.add_tag(["_Example.age","name","description","notes"])

Now let’s add data to the loop

>>> new_loop.add_data([29,"Jon","A BRMB employee", None])

Notice that data is automatically encapsulated as necessary to meet the NMR-STAR format (quotes around the data containing a space). You never have to worry about encapsulating data you insert to make it syntactically valid STAR. Notice also that python None types are automatically converted to the NMR-STAR “null” value, “.”.

>>> print(new_loop)
   loop_
      _Example.age
      _Example.name
      _Example.description
      _Example.notes

     29   Jon   'A BRMB employee'   .

   stop_

Add the loop to the entry_information saveframe

>>> ent15000['entry_information'].add_loop(new_loop)

See that the loop has been added to the saveframe

>>> ent15000['entry_information'].print_tree()
<pynmrstar.Saveframe 'entry_information'>
[0] <pynmrstar.Loop '_Entry_author'>
[1] <pynmrstar.Loop '_SG_project'>
[2] <pynmrstar.Loop '_Struct_keywords'>
[3] <pynmrstar.Loop '_Data_set'>
[4] <pynmrstar.Loop '_Datum'>
[5] <pynmrstar.Loop '_Release'>
[6] <pynmrstar.Loop '_Related_entries'>
[7] <pynmrstar.Loop '_Example'>

View the value of the tag ID in the assembly saveframe

>>> ent15000['assembly'].get_tag('ID')
1

To get the NMR-STAR representation of any object, just request its string representation:

>>> print(ent15000['assembly'])
#############################################
#  Molecular system (assembly) description  #
#############################################

save_assembly
   _Assembly.Sf_category                      assembly
   _Assembly.Sf_framecode                     assembly
   _Assembly.Entry_ID                         15000
   _Assembly.ID                               1
   _Assembly.Name                             F5-Phe-cVHP
   _Assembly.BMRB_code                        .
   _Assembly.Number_of_components             1
   _Assembly.Organic_ligands                  .
   _Assembly.Metal_ions                       .
   _Assembly.Non_standard_bonds               .
   _Assembly.Ambiguous_conformational_states  .
   _Assembly.Ambiguous_chem_comp_sites        .
   _Assembly.Molecules_in_chemical_exchange   .
   _Assembly.Paramagnetic                     no
   _Assembly.Thiol_state                      'all free'
   _Assembly.Molecular_mass                   .
   _Assembly.Enzyme_commission_number         .
   _Assembly.Details                          .
   _Assembly.DB_query_date                    .
   _Assembly.DB_query_revised_last_date       .

   loop_
      _Entity_assembly.ID
      _Entity_assembly.Entity_assembly_name
      _Entity_assembly.Entity_ID
      _Entity_assembly.Entity_label
      _Entity_assembly.Asym_ID
      _Entity_assembly.PDB_chain_ID
      _Entity_assembly.Experimental_data_reported
      _Entity_assembly.Physical_state
      _Entity_assembly.Conformational_isomer
      _Entity_assembly.Chemical_exchange_state
      _Entity_assembly.Magnetic_equivalence_group_code
      _Entity_assembly.Role
      _Entity_assembly.Details
      _Entity_assembly.Entry_ID
      _Entity_assembly.Assembly_ID

     1   F5-Phe-cVHP   1   $F5-Phe-cVHP   K   .   yes   native   no   no   .   .   .   15000   1

   stop_

save_

Reading Spectral Peaks from a NMR-STAR file

First load the file and get a list of the peak list saveframes

>>> ent6577 = pynmrstar.Entry.from_database(6577)
>>> spectral_peaks = ent6577.get_saveframes_by_category('spectral_peak_list')

Lets look at how many spectral peak list saveframes we have

>>> spectral_peaks
[<pynmrstar.Saveframe 'peak_list_1'>,
 <pynmrstar.Saveframe 'peak_list_2'>,
 <pynmrstar.Saveframe 'peak_list_3'>]

For this demo we’ll just look at one individual peak list

>>> peak1 = spectral_peaks[0]

We can see what loops this peak list saveframe contains

>>> peak1.print_tree()
<pynmrstar.Saveframe 'peak_list_1'>
    [0] <pynmrstar.Loop '_Spectral_peak_software'>
    [1] <pynmrstar.Loop '_Peak_general_char'>
    [2] <pynmrstar.Loop '_Peak_char'>
    [3] <pynmrstar.Loop '_Assigned_peak_chem_shift'>

Let’s see what the _Peak_char loop looks like in NMR-STAR format

>>> print(peak1['_Peak_char'])
   loop_
      _Peak_char.Peak_ID
      _Peak_char.Spectral_dim_ID
      _Peak_char.Chem_shift_val
      _Peak_char.Chem_shift_val_err
      _Peak_char.Line_width_val
      _Peak_char.Line_width_val_err
      _Peak_char.Phase_val
      _Peak_char.Phase_val_err
      _Peak_char.Decay_rate_val
      _Peak_char.Decay_rate_val_err
      _Peak_char.Coupling_pattern
      _Peak_char.Bounding_box_upper_val
      _Peak_char.Bounding_box_lower_val
      _Peak_char.Bounding_box_range_val
      _Peak_char.Details
      _Peak_char.Derivation_method_ID
      _Peak_char.Entry_ID
      _Peak_char.Spectral_peak_list_ID

     1      1   9.857   .   .   .   .   .   .   .   .   .   .   .   .   .   6577   1
     1      2   4.922   .   .   .   .   .   .   .   .   .   .   .   .   .   6577   1
     2      1   9.857   .   .   .   .   .   .   .   .   .   .   .   .   .   6577   1
     2      2   2.167   .   .   .   .   .   .   .   .   .   .   .   .   .   6577   1
     ...etc...

That is more information than we want right now. Lets get just the columns we need (we’ll get a list of lists, each inner list corresponds to a row:

>>> our_data = peak1['_Peak_char'].get_data_by_tag(['Peak_ID','Chem_shift_val'])
>>> print(our_data)
[['1', '9.857'],
 ['1', '4.922'],
 ['2', '9.857'],
 ['2', '2.167'],
 ['3', '9.855'],
 ['3', '1.994'],
 ...]

Excellent! Now we can iterate through each spectral peak and corresponding shift easily. The data is stored as a python list of lists (2 dimensional array) and we can modify or access it any of the normal ways python allows.

>>> for x in our_data:
>>>    print("Spectral chemical shift value is: " + str(x[1]))
Spectral chemical shift value is: 9.857
Spectral chemical shift value is: 4.922
Spectral chemical shift value is: 9.857
...

It is also easy to dump the table in a loop as a CSV

>>> print(peak1['_Peak_char'].get_data_as_csv())
_Peak_char.Peak_ID,_Peak_char.Spectral_dim_ID,_Peak_char.Chem_shift_val,_Peak_char.Chem_shift_val_err,_Peak_char.Line_width_val,_Peak_char.Line_width_val_err,_Peak_char.Phase_val,_Peak_char.Phase_val_err,_Peak_char.Decay_rate_val,_Peak_char.Decay_rate_val_err,_Peak_char.Coupling_pattern,_Peak_char.Bounding_box_upper_val,_Peak_char.Bounding_box_lower_val,_Peak_char.Bounding_box_range_val,_Peak_char.Details,_Peak_char.Derivation_method_ID,_Peak_char.Entry_ID,_Peak_char.Spectral_peak_list_ID
1,1,9.857,.,.,.,.,.,.,.,.,.,.,.,.,.,6577,1
1,2,4.922,.,.,.,.,.,.,.,.,.,.,.,.,.,6577,1
2,1,9.857,.,.,.,.,.,.,.,.,.,.,.,.,.,6577,1
2,2,2.167,.,.,.,.,.,.,.,.,.,.,.,.,.,6577,1
3,1,9.855,.,.,.,.,.,.,.,.,.,.,.,.,.,6577,1
3,2,1.994,.,.,.,.,.,.,.,.,.,.,.,.,.,6577,1
...