PyNMR-STAR quick start ====================== First, pull up an interactive python session and import the module: .. code:: python import pynmrstar There are many ways to load an NMR-STAR entry, but lets focus on the most common two. From the BMRB API (loads the most up to date version of an entry from the BMRB API): .. code:: python entry15000 = pynmrstar.Entry.from_database(15000) From a file: .. code:: python entry = pynmrstar.Entry.from_file("/location/of/the/file.str") Continuing on we will assume you have loaded entry 15000 from the API using the from\_database command. Writing out a modified entry or saveframe to file is just as easy: .. code:: python entry15000.write_to_file("output_file_name.str") Viewing the structure of the entry ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To see the overall structure of the entry, use the ``print_tree()`` method. .. code:: python entry15000.print_tree() [0] [0] [1] [2] [3] [4] [5] [6] [1] [0] [2] [0] [3] [0] [1] [2] [4] [0] [5] [0] [6] [0] [1] [2] [7] [0] [8] [0] [9] [0] [10] [0] [1] [11] [0] [1] [12] [0] [1] [13] [0] [1] [14] [0] [1] [15] [16] [17] [18] [19] [20] [21] [0] [22] [0] [23] [0] [24] [0] [1] You can see that there are 25 saveframes, and each saveframe contains some number of loops. Accessing saveframes and loops ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There are several ways to access saveframes and loops depending on what you hope to accomplish. The interactive session way ^^^^^^^^^^^^^^^^^^^^^^^^^^^ When playing with the library, debugging, or learning about NMR-STAR you will most likely find the following method most convenient. Note that it is not the correct pattern to use if you want to iterate all of the data in an entry (for reasons that will be explained below). You can access the saveframes in an entry directly using their *names*. For example, to get a reference to the spectrometer saveframe named ``spectrometer_1`` you can simply do the following: .. code:: python a_spectrometer = entry15000['spectrometer_1'] Note that you can see the saveframe names in the tree printout above. You can do the same for loops within a saveframe, but for loops you must use their tag category (the part before the period) to access them (note that to get to the ``Vendor`` loop we first had to go through its parent saveframe, named ``X-PLOR_NIH`` (the ``X-PLOR_NIH`` saveframe is of the category ``software`` - you'll see where you access the category later and why accessing by category is preferable). .. code:: python explor_nih_vendor = entry15000['X-PLOR_NIH']['_Vendor'] print(explor_nih_vendor) loop_ _Vendor.Name _Vendor.Address _Vendor.Electronic_address _Vendor.Entry_ID _Vendor.Software_ID 'CD Schwieters, JJ Kuszewski, N Tjandra and GM Clore' . . 15000 5 stop_ These shortcuts are there for your convenience when writing code. The reason you shouldn't use them in production code is because the saveframe names - what you use as a reference - can actually have any arbitrary value. They are fairly consistent, and for certain saveframes are always the same, but for other saveframes users can set them to whatever value they want during the deposition. Therefore the much better way to access data is via the *category*. Note that only one saveframe in an entry can have a given name, but multiple saveframes may be of the same category. The ``_`` prior to the ``Vendor`` loop category is to make it clear you want to access the loop and not a saveframe tag with the name ``Vendor``. The robust (and recommended) way ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ A better way to access data is via the category of the data you want to read, or by searching for it with a full tag name. Before going into detail, take a look at what one saveframe from the entry above looks like: .. code:: text ############################ # Computer software used # ############################ save_X-PLOR_NIH _Software.Sf_category software _Software.Sf_framecode X-PLOR_NIH _Software.Entry_ID 15000 _Software.ID 5 _Software.Name 'X-PLOR NIH' _Software.Version . _Software.Details . loop_ _Vendor.Name _Vendor.Address _Vendor.Electronic_address _Vendor.Entry_ID _Vendor.Software_ID 'CD Schwieters, JJ Kuszewski, N Tjandra and GM Clore' . . 15000 5 stop_ loop_ _Task.Task _Task.Entry_ID _Task.Software_ID refinement 15000 5 'structure solution' 15000 5 stop_ save_ This is a saveframe describing software that was used during an NMR study. You can see from the saveframe tags that the name of this software package is X-PLOR-NIH. You can see from the tag ``ID`` that it is the fifth software saveframe in this entry. The category of this saveframe is "software" which you can see in the ``Sf_category`` (short for saveframe category) tag. This saveframe also has two loops, a vendor loop and a task loop. These are loops rather than free tags as a given software package can have more than one vendor and more than one task it performs. Reading the software packages ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The more robust way to access the data in the software saveframes is by iterating over all of the software saveframes in the entry and pulling out the data we want. To do this for software, we would write the following: .. code:: python software_saveframes = entry15000.get_saveframes_by_category('software') software_saveframes [, , , , ] You can see that this method, ``get_saveframes_by_category`` returned all of the software saveframes in the entry. Now we can iterate through them to either pull out data, modify data, or remove data. (One note, each loop category - the text before the period in the loop tags - is unique to its parent saveframe. Therefore you will never find a ``Task`` loop in a saveframe with a category of anything other than ``software``. Furthermore, a saveframe can only have one loop of a given category. This means that accessing loops within a saveframe using the category notation is robust and will not lead to you missing a loop.) The following will combine all the task loops in the entry into CSV format. .. code:: python csv_data = "" print_header = True for software_sf in software_saveframes: # Wrap this in try/catch because it is not guaranteed a software saveframe will have a task loop try: csv_data += software_sf['_Task'].get_data_as_csv(header=print_header) print_header = False except KeyError: continue print(csv_data) '_Task.Task,_Task.Entry_ID,_Task.Software_ID\nprocessing,15000,1\n_Task.Task,_Task.Entry_ID,_Task.Software_ID\nchemical shift assignment,15000,2\ndata analysis,15000,2\npeak picking,15000,2\n_Task.Task,_Task.Entry_ID,_Task.Software_ID\nchemical shift assignment,15000,3\n_Task.Task,_Task.Entry_ID,_Task.Software_ID\nstructure solution,15000,4\n_Task.Task,_Task.Entry_ID,_Task.Software_ID\nrefinement,15000,5\nstructure solution,15000,5\n' Using get\_tag to pull tags directly from an entry ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Another way to access data in by using the full tag name. Keep in mind that a full tag contains a category first, a period, and then a tag name. So if we wanted to see all of the various ``_Task.Task`` that the software packages associated with this entry performed, a simple way to do so is with the ``get_tag()`` method of the entry: .. code:: python entry15000.get_tag('Task.Task') ['processing', 'chemical shift assignment', 'data analysis', 'peak picking', 'chemical shift assignment', 'structure solution', 'refinement', 'structure solution'] Or to get all of the spectrometer information - ``get_tags()`` accepts a list of tags to fetch and returns a dictionary pointing to all the values of each tag, with the order preserved: .. code:: python entry15000.get_tags(['_NMR_spectrometer.Manufacturer', '_NMR_spectrometer.Model', '_NMR_spectrometer.Field_strength']) {'_NMR_spectrometer.Field_strength': ['500', '500', '750', '600', '800', '900'], '_NMR_spectrometer.Manufacturer': ['Bruker', 'Bruker', 'Bruker', 'Varian', 'Varian', 'Varian'], '_NMR_spectrometer.Model': ['Avance', 'Avance', 'Avance', 'INOVA', 'INOVA', 'INOVA']} To view all of the tags in the NMR-STAR schema and their meanings, please go `here `__. # Assigned Chemical Shifts *"I just want to get the chemical shift data as an array - how do I do that?"* Keep in mind that an entry may have multiple sets of assigned chemical shifts. (For examples, there may be two sets of assignments that were made under two different sample conditions.) So to get the chemical shifts it is best to iterate through all the assigned chemical shift loops: .. code:: python cs_result_sets = [] for chemical_shift_loop in entry15000.get_loops_by_category("Atom_chem_shift"): cs_result_sets.append(chemical_shift_loop.get_tag(['Comp_index_ID', 'Comp_ID', 'Atom_ID', 'Atom_type', 'Val', 'Val_err'])) cs_result_sets [[['2', 'SER', 'H', 'H', '9.3070', '0.01'], ['2', 'SER', 'HA', 'H', '4.5970', '0.01'], ['2', 'SER', 'HB2', 'H', '4.3010', '0.01'], ['2', 'SER', 'HB3', 'H', '4.0550', '0.01'], ['2', 'SER', 'CB', 'C', '64.6000', '0.1'], ['2', 'SER', 'N', 'N', '121.5800', '0.1'], ['3', 'ASP', 'H', 'H', '8.0740', '0.01'], ['3', 'ASP', 'HA', 'H', '4.5580', '0.01'], ['3', 'ASP', 'HB2', 'H', '2.835', '0.01'], ... Note that we used the ``get_tag()`` method of the loop to only pull out the tags we were concerned with. ``get_tag()`` accepts an array of tags in addition to a single tag. The full assigned chemical saveframe loop will contain extra tags you may not need. For example: .. code:: python print(entry15000.get_loops_by_category("Atom_chem_shift")[0]) loop_ _Atom_chem_shift.ID _Atom_chem_shift.Assembly_atom_ID _Atom_chem_shift.Entity_assembly_ID _Atom_chem_shift.Entity_ID _Atom_chem_shift.Comp_index_ID _Atom_chem_shift.Seq_ID _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Atom_isotope_number _Atom_chem_shift.Val _Atom_chem_shift.Val_err _Atom_chem_shift.Assign_fig_of_merit _Atom_chem_shift.Ambiguity_code _Atom_chem_shift.Occupancy _Atom_chem_shift.Resonance_ID _Atom_chem_shift.Auth_entity_assembly_ID _Atom_chem_shift.Auth_asym_ID _Atom_chem_shift.Auth_seq_ID _Atom_chem_shift.Auth_comp_ID _Atom_chem_shift.Auth_atom_ID _Atom_chem_shift.Details _Atom_chem_shift.Entry_ID _Atom_chem_shift.Assigned_chem_shift_list_ID 1 . 1 1 2 2 SER H H 1 9.3070 0.01 . . . . . . 2 SER H . 15000 1 2 . 1 1 2 2 SER HA H 1 4.5970 0.01 . . . . . . 2 SER HA . 15000 1 3 . 1 1 2 2 SER HB2 H 1 4.3010 0.01 . . . . . . 2 SER HB2 . 15000 1 ... *"But I want to access the chemical shifts as numbers, not strings!"* That is easy to do. When you first load an entry it is by default loaded with all values as strings. To instead load it such that the values match the schema, simply set ``convert_data_types=True`` when you load the file using ``Entry.from_file()``. .. code:: python entry15000 = pynmrstar.Entry.from_database(15000, convert_data_types=True) cs_result_sets = [] for chemical_shift_loop in entry15000.get_loops_by_category("Atom_chem_shift"): cs_result_sets.append(chemical_shift_loop.get_tag(['Comp_index_ID', 'Comp_ID', 'Atom_ID', 'Atom_type', 'Val', 'Val_err'])) print(cs_result_sets) [[[2, 'SER', 'H', 'H', Decimal('9.3070'), Decimal('0.01')], [2, 'SER', 'HA', 'H', Decimal('4.5970'), Decimal('0.01')], [2, 'SER', 'HB2', 'H', Decimal('4.3010'), Decimal('0.01')], [2, 'SER', 'HB3', 'H', Decimal('4.0550'), Decimal('0.01')], [2, 'SER', 'CB', 'C', Decimal('64.6000'), Decimal('0.1')], [2, 'SER', 'N', 'N', Decimal('121.5800'), Decimal('0.1')], [3, 'ASP', 'H', 'H', Decimal('8.0740'), Decimal('0.01')], [3, 'ASP', 'HA', 'H', Decimal('4.5580'), Decimal('0.01')], [3, 'ASP', 'HB2', 'H', Decimal('2.835'), Decimal('0.01')], [3, 'ASP', 'HB3', 'H', Decimal('2.754'), Decimal('0.01')], [3, 'ASP', 'CA', 'C', Decimal('57.6400'), Decimal('0.1')], [3, 'ASP', 'N', 'N', Decimal('121.1040'), Decimal('0.1')], ... This is a great opportunity to point out that if all you want is the chemical shifts, or one or two tags, you may find it significantly easier to use the `BMRB API `__ (`chemical shift endpoint `__) to fetch that data directly and on-demand rather than dealing directly with NMR-STAR at all. Creating new loops and saveframes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This tutorial has so far focused on how to read and access data. This section will focus on how to create new loop and saveframe objects. Loops ^^^^^ There are five ways to make a new loop: ``from_file()``, ``from_json()``, ``from_scratch()``, ``from_string()``, and ``from_template()``. All of these are classmethods. ``from_scratch()`` makes a new loop, ``from_string()`` parses an NMR-STAR loop from a python string containing NMR-STAR data, ``from_json()`` parses a JSON object (reversely, ``get_json()`` will get a JSON representation of the loop), ``from_scratch()`` makes a completely empty loop, and ``from_template()`` makes a loop with the tags pre-filled from the BMRB schema based on the provided category. ``from_file``, ``from_json``, and ``from_string`` are fairly self-explanatory - see the full documentation if needed for usage. ``from_scratch()`` ------------------ .. code:: python lp = pynmrstar.Loop.from_scratch() print(lp) loop_ stop_ lp.add_tag(['loop_category.tag1', 'loop_category.tag2', 'loop_category.tag3']) print(lp) loop_ _loop_category.tag1 _loop_category.tag2 _loop_category.tag3 stop_ # Note that when calling add_data the length of the array must match the number of tags in the loop lp.add_data(['value_1', 2, 'value 3']) print(lp) loop_ _loop_category.tag1 _loop_category.tag2 _loop_category.tag3 value_1 2 'value 3' stop_ # Alternatively, you can (with caution) directly modify the array corresponding to the loop data lp.data = [[1,2,3],[4,5,6]] print(lp) loop_ _loop_category.tag1 _loop_category.tag2 _loop_category.tag3 1 2 3 4 5 6 stop_ Note that the loop category was set automatically when the tag ``loop_category.tag1`` was added. You could have also provided the tag when creating the loop by providing it as an argument to the optional ``category`` argument to the constructor. ``from_template()`` ------------------- This method will create a new loop ready for data with the tags from the BMRB schema corresponding to that loop category. .. code:: python chemical_shifts = pynmrstar.Loop.from_template('atom_chem_shift_list') print(chemical_shifts) loop_ _Atom_chem_shift.ID _Atom_chem_shift.Assembly_atom_ID _Atom_chem_shift.Entity_assembly_ID _Atom_chem_shift.Entity_ID _Atom_chem_shift.Comp_index_ID _Atom_chem_shift.Seq_ID _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Atom_isotope_number _Atom_chem_shift.Val _Atom_chem_shift.Val_err _Atom_chem_shift.Assign_fig_of_merit _Atom_chem_shift.Ambiguity_code _Atom_chem_shift.Ambiguity_set_ID _Atom_chem_shift.Occupancy _Atom_chem_shift.Resonance_ID _Atom_chem_shift.Auth_entity_assembly_ID _Atom_chem_shift.Auth_asym_ID _Atom_chem_shift.Auth_seq_ID _Atom_chem_shift.Auth_comp_ID _Atom_chem_shift.Auth_atom_ID _Atom_chem_shift.Details _Atom_chem_shift.Entry_ID _Atom_chem_shift.Assigned_chem_shift_list_ID stop_ Saveframes ^^^^^^^^^^ There are five ways to make a new saveframe: ``from_file()``, ``from_json()``, ``from_scratch()``, ``from_string()``, and ``from_template()``. All of these are classmethods. ``from_scratch()`` makes a new saveframe, ``from_string()`` parses an NMR-STAR saveframe from a python string containing NMR-STAR data, ``from_json()`` parses a JSON object (reversely, ``get_json()`` will get a JSON representation of the saveframe), ``from_scratch()`` makes a completely empty saveframe, and ``from_template()`` makes a saveframe with the tags pre-filled from the BMRB schema based on the provided category. ``from_file``, ``from_json``, and ``from_string`` are fairly self-explanatory - see the full documentation if needed for usage. ``from_scratch()`` ------------------ .. code:: python # You must provide the saveframe name (the value that comes after "save_" at the start of the saveframe and saveframe # tag prefix (the value before the "." in a tag name) when creating a saveframe this way my_sf = pynmrstar.Saveframe.from_scratch("sf_name", "example_sf_category") print(my_sf) save_sf_name save_ # Add a tag using the add_tag() method. Update=True will override existing tag with the same name. # Update=False will raise an exception if the tag already exists my_sf.add_tag("tagName1", "tagValue1") print(my_sf) save_sf_name _example_sf_category.tagName1 tagValue1 save_ my_sf.add_tag("tagName1", "tagValue2", update=False) ValueError: There is already a tag with the name 'tagName1'. my_sf.add_tag("tagName1", "tagValue2", update=True) print(my_sf) save_sf_name _example_sf_category.tagName1 tagValue1 save_ # Alternatively, you can access or write tag values using direct subset access: my_sf['tagName1'] ['tagValue2'] my_sf['tagName2'] = "some value" print(my_sf) save_sf_name _example_sf_category.tagName1 tagValue2 _example_sf_category.tagName2 'some value' save_ # Now add the loop we created before my_sf.add_loop(lp) print(my_sf) save_sf_name _example_sf_category.tagName1 tagValue2 _example_sf_category.tagName2 'some value' loop_ _loop_category.tag1 _loop_category.tag2 _loop_category.tag3 1 2 3 4 5 6 stop_ save_ # Now write out our saveframe to a file. Optionally specify format="json" to write in JSON format. my_sf.write_to_file("file_name.str") my_sf.write_to_file("file_name.json", format_="json") ``from_template()`` ------------------- .. code:: python my_sf = pynmrstar.Saveframe.from_template("assigned_chemical_shifts") print(my_sf) ################################### # Assigned chemical shift lists # ################################### ################################################################### # Chemical Shift Ambiguity Index Value Definitions # # # # The values other than 1 are used for those atoms with different # # chemical shifts that cannot be assigned to stereospecific atoms # # or to specific residues or chains. # # # # Index Value Definition # # # # 1 Unique (including isolated methyl protons, # # geminal atoms, and geminal methyl # # groups with identical chemical shifts) # # (e.g. ILE HD11, HD12, HD13 protons) # # 2 Ambiguity of geminal atoms or geminal methyl # # proton groups (e.g. ASP HB2 and HB3 # # protons, LEU CD1 and CD2 carbons, or # # LEU HD11, HD12, HD13 and HD21, HD22, # # HD23 methyl protons) # # 3 Aromatic atoms on opposite sides of # # symmetrical rings (e.g. TYR HE1 and HE2 # # protons) # # 4 Intraresidue ambiguities (e.g. LYS HG and # # HD protons or TRP HZ2 and HZ3 protons) # # 5 Interresidue ambiguities (LYS 12 vs. LYS 27) # # 6 Intermolecular ambiguities (e.g. ASP 31 CA # # in monomer 1 and ASP 31 CA in monomer 2 # # of an asymmetrical homodimer, duplex # # DNA assignments, or other assignments # # that may apply to atoms in one or more # # molecule in the molecular assembly) # # 9 Ambiguous, specific ambiguity not defined # # # ################################################################### save_assigned_chemical_shifts _Assigned_chem_shift_list.Sf_category assigned_chemical_shifts _Assigned_chem_shift_list.Sf_framecode assigned_chemical_shifts _Assigned_chem_shift_list.Entry_ID . _Assigned_chem_shift_list.ID . _Assigned_chem_shift_list.Sample_condition_list_ID . _Assigned_chem_shift_list.Sample_condition_list_label . _Assigned_chem_shift_list.Chem_shift_reference_ID . _Assigned_chem_shift_list.Chem_shift_reference_label . _Assigned_chem_shift_list.Chem_shift_1H_err . _Assigned_chem_shift_list.Chem_shift_13C_err . _Assigned_chem_shift_list.Chem_shift_15N_err . _Assigned_chem_shift_list.Chem_shift_31P_err . _Assigned_chem_shift_list.Chem_shift_2H_err . _Assigned_chem_shift_list.Chem_shift_19F_err . _Assigned_chem_shift_list.Error_derivation_method . _Assigned_chem_shift_list.Details . _Assigned_chem_shift_list.Text_data_format . _Assigned_chem_shift_list.Text_data . loop_ _Chem_shift_experiment.Experiment_ID _Chem_shift_experiment.Experiment_name _Chem_shift_experiment.Sample_ID _Chem_shift_experiment.Sample_label _Chem_shift_experiment.Sample_state _Chem_shift_experiment.Entry_ID _Chem_shift_experiment.Assigned_chem_shift_list_ID stop_ loop_ _Systematic_chem_shift_offset.Type _Systematic_chem_shift_offset.Atom_type _Systematic_chem_shift_offset.Atom_isotope_number _Systematic_chem_shift_offset.Val _Systematic_chem_shift_offset.Val_err _Systematic_chem_shift_offset.Entry_ID _Systematic_chem_shift_offset.Assigned_chem_shift_list_ID stop_ loop_ _Chem_shift_software.Software_ID _Chem_shift_software.Software_label _Chem_shift_software.Method_ID _Chem_shift_software.Method_label _Chem_shift_software.Entry_ID _Chem_shift_software.Assigned_chem_shift_list_ID stop_ loop_ _Atom_chem_shift.ID _Atom_chem_shift.Assembly_atom_ID _Atom_chem_shift.Entity_assembly_ID _Atom_chem_shift.Entity_ID _Atom_chem_shift.Comp_index_ID _Atom_chem_shift.Seq_ID _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Atom_isotope_number _Atom_chem_shift.Val _Atom_chem_shift.Val_err _Atom_chem_shift.Assign_fig_of_merit _Atom_chem_shift.Ambiguity_code _Atom_chem_shift.Ambiguity_set_ID _Atom_chem_shift.Occupancy _Atom_chem_shift.Resonance_ID _Atom_chem_shift.Auth_entity_assembly_ID _Atom_chem_shift.Auth_asym_ID _Atom_chem_shift.Auth_seq_ID _Atom_chem_shift.Auth_comp_ID _Atom_chem_shift.Auth_atom_ID _Atom_chem_shift.Details _Atom_chem_shift.Entry_ID _Atom_chem_shift.Assigned_chem_shift_list_ID stop_ loop_ _Ambiguous_atom_chem_shift.Ambiguous_shift_set_ID _Ambiguous_atom_chem_shift.Atom_chem_shift_ID _Ambiguous_atom_chem_shift.Entry_ID _Ambiguous_atom_chem_shift.Assigned_chem_shift_list_ID stop_ save_ Schema methods ~~~~~~~~~~~~~~ The library makes it easy to add missing tags, sort the tags according to the BMRB schema, and validate the data against the schema. Let's do a simple example of creating a chemical shift loop, adding any missing tags, ordering the tags in the standard order (not required), and then checking for errors. .. code:: python # Create the loop with the proper category my_cs_loop = pynmrstar.Loop.from_scratch("Atom_chem_shift") # Add the tags we will fill my_cs_loop.add_tag(['Comp_ID', 'Atom_ID', 'Comp_index_ID', 'Atom_type', 'Val', 'Val_err']) print(my_cs_loop) loop_ _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Comp_Index_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Val _Atom_chem_shift.Val_err stop_ # Populate the data array my_cs_loop.data = [['SER', 'H', '2', 'H', '9.3070', '0.01'], ['SER', 'HA', '2', 'H', '4.5970', '0.01'], ['SER', 'HB2', '2', 'H', '4.3010', '0.01']] print(my_cs_loop) loop_ _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Comp_Index_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Val _Atom_chem_shift.Val_err SER H 2 H 9.3070 0.01 SER HA 2 H 4.5970 0.01 SER HB2 2 H 4.3010 0.01 stop_ # Now lets sort the tags to match the BMRB schema my_cs_loop.sort_tags() # You can see that the Comp_index_ID tag has been moved to the front to match the BMRB standard print(my_cs_loop) loop_ _Atom_chem_shift.Comp_index_ID _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Val _Atom_chem_shift.Val_err 2 SER H H 9.3070 0.01 2 SER HA H 4.5970 0.01 2 SER HB2 H 4.3010 0.01 stop_ # Check for any errors - returns a list of errors. No errors here: print(my_cs_loop.validate()) [] # Let us now set 'Comp_index_ID' to have an invalid value my_cs_loop.data[0][0] = "invalid" # You can see that there is now a validation error - the data doesn't match the specified type print(my_cs_loop.validate()) ["Value does not match specification: '_Atom_chem_shift.Comp_index_ID':'invalid' on line '0 tag 0 of loop'.\n Type specified: int\n Regular expression for type: '-?[0-9]+'"] # If you use the pynmrstar.validate(object) function, it will print the report in a human-readable format pynmrstar.validate(my_cs_loop) 1: Value does not match specification: '_Atom_chem_shift.Comp_index_ID':'invalid' on line '0 tag 0 of loop'. Type specified: int Regular expression for type: '-?[0-9]+' # Finally, add in any tags that you didn't have a value for my_cs_loop.add_missing_tags() # You can see that all the standard "Atom_chem_shift" loop tags have been added, and their values # all set to a logical null value - "." print(my_cs_loop) loop_ _Atom_chem_shift.ID _Atom_chem_shift.Assembly_atom_ID _Atom_chem_shift.Entity_assembly_ID _Atom_chem_shift.Entity_ID _Atom_chem_shift.Comp_index_ID _Atom_chem_shift.Seq_ID _Atom_chem_shift.Comp_ID _Atom_chem_shift.Atom_ID _Atom_chem_shift.Atom_type _Atom_chem_shift.Atom_isotope_number _Atom_chem_shift.Val _Atom_chem_shift.Val_err _Atom_chem_shift.Assign_fig_of_merit _Atom_chem_shift.Ambiguity_code _Atom_chem_shift.Ambiguity_set_ID _Atom_chem_shift.Occupancy _Atom_chem_shift.Resonance_ID _Atom_chem_shift.Auth_entity_assembly_ID _Atom_chem_shift.Auth_asym_ID _Atom_chem_shift.Auth_seq_ID _Atom_chem_shift.Auth_comp_ID _Atom_chem_shift.Auth_atom_ID _Atom_chem_shift.Details _Atom_chem_shift.Entry_ID _Atom_chem_shift.Assigned_chem_shift_list_ID . . . . invalid . SER H H . 9.3070 0.01 . . . . . . . . . . . . . . . . . 2 . SER HA H . 4.5970 0.01 . . . . . . . . . . . . . . . . . 2 . SER HB2 H . 4.3010 0.01 . . . . . . . . . . . . . stop_ Go on to the full :doc:`class definitions ` or see more :doc:`examples `.