CCP4I2 Developers Data Classes and Containers

Data Classes

All data classes are sub-classed from CCP4Data.CData. The same classes are used by the GUI and the pipelines. CCP4Data.CData (in the core directory) currently subclasses either CCP4QtObject.CObject or CCP4Object.CObject. The first of these subclasses the Qt class QtCore.QObject, which provides signals and slots functionality. The second is intended as an alternative that avoids the Qt dependence but is not yet implemented.

The CCP4Data.CBaseData class sub-classes CData and is the base class for all 'simple' data classes that hold one item of data.

Currently the data classes are in:

There is a GUI widget for each data class, although there is not necessarily a simple one-to-one relationship: each widget class has a MODEL_CLASS parameter which lists the data classes it can represent. The GUI classes are in the qtgui directory.

To enable easy XML input/output of data, all data classes have methods to get and set an eTree representation of their data content. The eTree functionality is provided by lxml. The eTree representation is easily imported from or exported as XML.
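The get/set eTree round trip can be sketched roughly as follows. ToyInt is a hypothetical stand-in class, not part of CCP4i2, and the standard library's xml.etree.ElementTree is used in place of lxml only so that the sketch runs on its own; the real CData methods may differ in signature and return values.

```python
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (an assumption for portability)


class ToyInt:
    """Hypothetical stand-in for a simple data class; not the CCP4i2 implementation."""

    def __init__(self, name, value=None):
        self.name = name      # used as the XML element tag
        self.value = value    # None means 'not set'

    def getEtree(self):
        # Build an element whose tag is the object name and whose text is the value.
        element = ET.Element(self.name)
        if self.value is not None:
            element.text = str(self.value)
        return element

    def setEtree(self, element):
        # Initialise the value from the element text; empty text leaves it unset.
        text = (element.text or "").strip()
        self.value = int(text) if text else None


# Export one object to XML text, then load that text into a second object.
original = ToyInt("NCYCLES", 25)
xml_text = ET.tostring(original.getEtree(), encoding="unicode")

restored = ToyInt("NCYCLES")
restored.setEtree(ET.fromstring(xml_text))
```

The element tag doubles as the object name, which mirrors the statement elsewhere in this document that the name is the same as the element tag in the XML representation.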

All CData classes have a few class-wide attributes that are usually defined at the top of the class. These are:

| Attribute | Type | Description |
|---|---|---|
| CONTENTS | Dictionary; keys are sub-object names and values are dictionaries with at least a class key. | The contents of 'complex' classes. |
| PYTHONTYPE | A Python type | The Python type for simple data classes. |
| QUALIFIERS | Dictionary; keys are qualifier names and values are of the appropriate type for the qualifier. | Specify the default qualifiers for this class. This can be over-ridden for any instance of the class. |
| QUALIFIERS_ORDER | A list of qualifier names | The order in which qualifiers will appear in the defEd GUI. |
| QUALIFIERS_DEFINITION | Dictionary; keys are qualifier names and values are dictionaries with at least the keys type (a Python type) and description (a brief description for the defEd GUI). | |
| ERROR_CODES | Dictionary | Described above. |
| PROPERTIES | Dictionary; keys are property names and values are dictionaries with fget and fset keys. | Analogous to the Python property function. This must be placed after the methods for fget and fset. |

These attributes should be accessed by methods such as contents() and qualifier() rather than accessed directly. The access methods implement the principle that sub-classes inherit the properties of base classes and that the QUALIFIER attributes can be over-ridden in particular class instances.
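The inheritance and per-instance override of qualifiers might be pictured with the minimal sketch below. The classes and the qualifier names (allowUndefined, default, min, max) are assumptions chosen for illustration; the actual lookup in CCP4Data.CData may be implemented differently.

```python
class ToyData:
    """Hypothetical base class showing inherited, overridable qualifiers."""

    QUALIFIERS = {"allowUndefined": True, "default": None}

    def __init__(self, qualifiers=None):
        # Per-instance overrides are kept separate from the class-wide defaults.
        self._custom = dict(qualifiers or {})

    def qualifiers(self, name=None):
        merged = {}
        # Walk from the most basic class to the most derived one so that
        # sub-class values replace base-class values.
        for cls in reversed(type(self).__mro__):
            merged.update(vars(cls).get("QUALIFIERS", {}))
        # Instance-level values take precedence over every class-level value.
        merged.update(self._custom)
        return merged if name is None else merged[name]


class ToyInt(ToyData):
    QUALIFIERS = {"min": None, "max": None, "default": 0}


plain = ToyInt()
limited = ToyInt(qualifiers={"max": 10})
```

Reading values through the access method rather than the class attribute is what makes both levels of override visible: ToyInt.QUALIFIERS alone knows nothing about allowUndefined or about the customised max.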

The following table explains the main class methods - it is not comprehensive documentation.

| CData method | Argument | Description |
|---|---|---|
| (constructor) | parent | The Qt system expects each QObject to have a parent QObject, with the first ancestor object being the Qt QApplication. The CCP4Modules.QTAPPLICATION() function will return the QApplication if you need it, but the usual parent of CData should be the CContainer that holds the data. |
| | qualifiers | A dictionary of parameters to 'fine-tune' the class, particularly specifying validation parameters such as maximum or minimum allowed values. This mechanism is intended to reduce the need for sub-classing. The allowed values can be seen in the __init__ method. The CData base class expects two parameters: a name and a default value. The name is the same as the element tag in the XML representation of the class. The default default value is the Python None (i.e. not set). |
| parent | | Return the parent object. |
| objectName | | Return the object's name. |
| objectPath | | Return a 'path' name including parent object names. An underscore separator is used. |
| getEtree | | Return an eTree element representing the data object. |
| setEtree | | Parse an eTree element and initialise the data object. |
| | element | An eTree element. |
| set | | Set the data. The method calls the validity method and will only set the data if the validity is OK. It returns the CErrorReport from validity(). |
| | data | The set method attempts to be very tolerant of input, e.g. i = CInt(); i.set('12') is acceptable. Simple, one-item data objects expect the appropriate Python type or the appropriate CData class as the argument to set. Complex, multi-item data objects expect Python dictionaries or the different components on the command line. See the example code. |
| unSet | | Unset the data. |
| get | | Return the data contents of the object. For a simple class with a single item, that item is returned; if the data has multiple items then they are returned as a Python dictionary. |
| | name | Optional name of one item in a multi-item data class. If this is set then only the named item is returned. |
| validity | | Return a CErrorReport indicating the validity of the input data. |
| fix | | Return a dictionary representation of the class data that has been fixed (possibly with significant change to content) to be valid. This method is only useful where it has been reimplemented in a sub-class. |
| isSet | | Return True or False depending on whether the data is set or is a null value. |
| emitDataChanged | | A wrapper for the Qt QObject.emit() method to emit a 'dataChanged' signal. |
| setQualifiersEtree | | Parse an eTree element containing the qualifiers for the class. This is used when the data type definitions are read from an XML file. Returns a CErrorReport. |
| | element | An eTree element. |
| getQualifiersEtree | | Return a tuple of an eTree element containing the data qualifiers for the class and a CErrorReport. |
| contents | | Return a dictionary specifying the contents of a complex class. |
| | name | Name of one item in the class. If this is set then the definition of just that item is returned. |
| qualifiers | | Return a dictionary of qualifier values for the class. |
| | name | The name of a qualifier. If this is set then only the value of that qualifier is returned. |
| | default | If False, do not return the qualifiers that have the default value for the class. |
| | custom | If False, do not return the qualifiers that have been customised in this instance of the class. |
| qualifiersDefinition | | Return the definition of the qualifiers for the class. |
| | name | The name of a qualifier. If this is set then only the definition of that qualifier is returned. |
| pythonType | | Return the Python type equivalent for a simple class. |
| qualifiersOrder | | Return the order of the qualifiers used in a GUI. |
| setDefault | | Set the data to the default value. |
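The interplay of set, validity, get, isSet and unSet described in the table can be sketched as below. This ToyInt is a hypothetical stand-in, a plain list of messages stands in for CErrorReport, and the min and max qualifier names are assumptions.

```python
class ToyInt:
    """Hypothetical simple data class; not the CCP4i2 CInt implementation."""

    def __init__(self, qualifiers=None):
        self._qualifiers = {"min": None, "max": None}
        self._qualifiers.update(qualifiers or {})
        self._value = None  # None means 'not set'

    def validity(self, value):
        # Be tolerant of input: both 12 and '12' are accepted.
        try:
            number = int(value)
        except (TypeError, ValueError):
            return ["cannot interpret %r as an integer" % (value,)]
        errors = []
        minimum = self._qualifiers["min"]
        maximum = self._qualifiers["max"]
        if minimum is not None and number < minimum:
            errors.append("%d is below the minimum %d" % (number, minimum))
        if maximum is not None and number > maximum:
            errors.append("%d is above the maximum %d" % (number, maximum))
        return errors

    def set(self, value):
        # Only store the value when the validity check reports no errors.
        errors = self.validity(value)
        if not errors:
            self._value = int(value)
        return errors

    def unSet(self):
        self._value = None

    def isSet(self):
        return self._value is not None

    def get(self):
        return self._value


i = ToyInt(qualifiers={"max": 20})
accepted = i.set("12")   # a string is converted and stored
rejected = i.set(99)     # violates 'max', so the stored value is unchanged
```

Returning the error report from set, rather than raising, lets a caller such as a GUI widget decide how to present the problem while the previous valid value stays in place.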

The Container Class

CCP4Container.CContainer is a sub-class of CData which differs mostly in that its data contents (for most CData classes defined by the CONTENTS attribute) are defined at run time. The container holds a set of data objects in a Python dictionary. Typically one container is associated with one GUI window or one program wrapper. The data container classes are not sub-classed; their contents are specified when the class is instantiated. The data objects within the container apply their validity() method to ensure that all loaded data is valid. A data container may contain sub-containers.

Typically the content of a CContainer is defined in a DEF (extension def.xml) file and its data is imported from and exported to PARAMS (extension params.xml) files. When the content definition is read the data objects to hold the data are created automatically.

Containers are the Python representation of PARAMS files, and these files contain a header holding meta-data such as creation date and project id. The header can be accessed as myContainer.header and is a CCP4File.CI2XmlHeader object. Task developers should not normally need to access this data as it is handled by the CCP4i2 core.

| CContainer method | Argument | Description |
|---|---|---|
| loadContentsFromXml | | Load the content definition from a DEF file. Returns a CErrorReport. |
| | fileName | The full path name of a file. |
| loadDataFromXml | | Load the data values from a PARAMS file. Should only be used after the contents have been defined. Returns a CErrorReport. |
| | fileName | The full path name of a file. |
| saveContentsToXml | | Save the content definition to a DEF file. Returns a CErrorReport. |
| | fileName | The full path name of a file. |
| saveDataToXml | | Save the data values to a PARAMS file. Returns a CErrorReport. |
| | fileName | The full path name of a file. |
| loadContentsFromEtree | | Load the content definition from an eTree element. |
| | element | An eTree element. |
| loadDataFromEtree | | Load the data values from an eTree element. Should only be used after the contents have been defined. |
| | element | An eTree element. |
| saveContentsToEtree | | Return a Python tuple of an eTree element representing the contents and a CErrorReport. |
| saveDataToEtree | | Return a Python tuple of an eTree element representing the data values and a CErrorReport. |
| addContent | | The arguments to this method define a data object. The method creates a new object and appends it to the container. |
| | name | A name for the new data object. |
| | cls | The class of the new data object. |
| | qualifiers | A Python dictionary of qualifiers for the new object. |
| addObject | | Add an already existing data object to the container. |
| | name | A name for the new data object. |
| | object | The CData object. |
| | afterObject | Insert the new object in the container after the object with this name. |
| replaceObject | | Replace a data object with the given name by another existing object. |
| | name | The name of an existing data object in the container. |
| | object | A CData object. |
| deleteObject | | Delete the object with the given name. |
| | name | The name of an existing data object in the container. |
| renameObject | | Rename an object. |
| | oldName | The name of an existing data object in the container. |
| | newName | A new name. |
| clear | | Remove all content from the container. |
| dataOrder | | Return a list of the names of all data objects (and sub-containers) in the container. |
| addHeader | | Add a CCP4File.CHeader header to the container. This header can be set appropriately prior to exporting an XML file. |
| parseCommandLine | | Load data into the container by parsing a 'command line' formatted as a list of words. |
| | commandLine | A list of words such as sys.argv. |
| | template | Optional template for interpreting the command line. |
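The run-time definition of container contents can be illustrated with a minimal sketch. ToyContainer and ToyData are hypothetical stand-ins that borrow method names from the table above; they make no claim about how CCP4Container.CContainer stores or validates its objects.

```python
class ToyData:
    """Hypothetical data object that only remembers its qualifiers."""

    def __init__(self, qualifiers=None):
        self.qualifiers = dict(qualifiers or {})


class ToyContainer:
    """Hypothetical container whose contents are defined at run time."""

    def __init__(self):
        self._objects = {}   # name -> data object
        self._order = []     # names in container order

    def addContent(self, name, cls, qualifiers=None):
        # Create a new object from its class and append it to the container.
        return self.addObject(name, cls(qualifiers))

    def addObject(self, name, obj, afterObject=None):
        if name in self._objects:
            raise KeyError("duplicate object name: " + name)
        index = len(self._order)
        if afterObject is not None:
            index = self._order.index(afterObject) + 1
        self._order.insert(index, name)
        self._objects[name] = obj
        return obj

    def renameObject(self, oldName, newName):
        # Keep the position in the order list while changing the key.
        self._order[self._order.index(oldName)] = newName
        self._objects[newName] = self._objects.pop(oldName)

    def deleteObject(self, name):
        self._order.remove(name)
        del self._objects[name]

    def dataOrder(self):
        return list(self._order)


container = ToyContainer()
container.addContent("XYZIN", ToyData)
container.addContent("NCYCLES", ToyData, {"default": 10})
container.addObject("HKLIN", ToyData(), afterObject="XYZIN")
container.renameObject("NCYCLES", "CYCLES")
```

Keeping a separate order list alongside the dictionary is one simple way to support afterObject and dataOrder; the parameter names XYZIN, HKLIN and NCYCLES are illustrative only.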

Creating a New File Type

This is a brief how-to - mostly to remind Liz.

To register a new file type in CCP4i2 it needs to be added in three different places (no, that's not good but at least it's documented!):

A simple file class definition might look like:

class CPhaserSolDataFile(CCP4File.CDataFile):
  QUALIFIERS = { 'mimeTypeName' : 'application/phaser-sol',
                 'mimeTypeDescription' : 'Phaser solution file',
                 'fileExtensions' : [ 'phaser_sol.pkl' ],
                 'fileContentClassName' : None,
                 'fileLabel' : 'phaser_sol',
                 'guiLabel' : 'Phaser solution file',
                 'toolTip' : "Possible solutions passed between runs of the Phaser program" }

Note that the mimeTypeName must match the mime type in CCP4CustomMimeTypes, CCP4DbApi. The fileContentClassName is not set here but if i2 needs to read the file then a file content class that sub-classes CCP4File.CDataFileContent should be implemented and the class name provided here as a string.

A definition of the mime type looks like this:

      mimeType = CMimeType()    # definition of the "application/phaser-sol" mime type
      mimeType.description = "Phaser solution file"
      mimeType.fileExtensions = ['phaser_sol.pkl']
      mimeType.viewers = []
      mimeType.icon = 'PhaserSolDataFile'
      mimeType.className = 'PhaserSolDataFile'
      self.mimeTypes["application/phaser-sol"] = mimeType

Note that className should cross-reference the class name. By default the system assumes that the file icon has the same name as the class, but an alternative icon for use in the file browser can be provided here. Note also that className and icon drop the leading 'C'.

In the database file CCP4DbApi it is necessary to add the file type to three lists:

It is imperative to keep these three lists in sync! Also, if a file type is added to the lists and is subsequently not required, its slot should not be deleted or reused; it should just be left as a 'dummy', the same as file type 14.
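The reason for never deleting a slot is that the position in each list acts as the file type id. The sketch below shows the idea with three hypothetical parallel lists; the list names and all entries except the phaser-sol ones are invented for illustration and are not the lists in CCP4DbApi.

```python
# Hypothetical parallel lists indexed by file type id (illustrative names and entries).
FILE_TYPE_MIME = [
    "unknown",
    "application/example-a",
    "dummy",                      # retired type: slot kept so later ids do not shift
    "application/phaser-sol",
]
FILE_TYPE_CLASS = ["DataFile", "ExampleADataFile", "DummyDataFile", "PhaserSolDataFile"]
FILE_TYPE_LABEL = ["Unknown", "Example A file", "Dummy", "Phaser solution file"]


def check_in_sync(*lists):
    """Return the common length, or raise if the lists have drifted apart."""
    lengths = {len(items) for items in lists}
    if len(lengths) != 1:
        raise ValueError("file type lists are out of sync: %s" % sorted(lengths))
    return lengths.pop()


def file_type_id(mime_type):
    """Look up the integer id of a mime type from its position in the list."""
    return FILE_TYPE_MIME.index(mime_type)


n_types = check_in_sync(FILE_TYPE_MIME, FILE_TYPE_CLASS, FILE_TYPE_LABEL)
phaser_id = file_type_id("application/phaser-sol")
```

If the 'dummy' entry were removed, the phaser-sol id would change from 3 to 2 and any ids already stored in a database would point at the wrong type.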

File content classes and contentFlag and subType flags

The classes derived from CDataFile, such as CPdbDataFile and the various forms of mini-MTZ such as CObsDataFile, basically serve as a reference to the data file, holding the file path and some additional flags discussed later. A fileContent class such as CPdbData or CMtzData can hold some data extracted from the file. An example of using these:

   >>> import CCP4XtalData
   >>> f = CCP4XtalData.CMtzDataFile('/y/people/lizp/rnase25.mtz')
   >>> print type(f.fileContent)
   <class 'CCP4XtalData.CMtzData'>
   >>> print f.fileContent.cell
   {'a': '64.8970031738', 'c': '38.7919998169', 'b': '78.3229980469', 'beta': '1.57079632679', 'alpha': '1.57079632679', 'gamma': '1.57079632679'}

The file content classes are mostly being developed as required - please contact Liz if you need further functionality.

The file classes also contain two integer flags, contentFlag and subType, that indicate something about the content of a selected file. The contentFlag is most important for mini-MTZs to indicate the representation of the data (see miniMtzs for explanation).

CDataFile contentFlag allowed values

| Flag | Value | Description |
|---|---|---|
| CObsDataFile.CONTENT_FLAG_IPAIR | 1 | Friedel pairs of intensities |
| CObsDataFile.CONTENT_FLAG_FPAIR | 2 | Friedel pairs of structure factors |
| CObsDataFile.CONTENT_FLAG_IMEAN | 3 | averaged intensities |
| CObsDataFile.CONTENT_FLAG_FMEAN | 4 | averaged structure factors |
| CPhsDataFile.CONTENT_FLAG_HL | 1 | Hendrickson-Lattman coefficients |
| CPhsDataFile.CONTENT_FLAG_PHIFOM | 1 | phase and figure of merit |
| CMapCoeffsDataFile.CONTENT_FLAG_FPHI | 1 | structure factor and phase |

The contentFlag is presently only used for mini-MTZs but, in principle, could be used to distinguish different forms of other data files. This flag is useful in automating the handling of mini-MTZs and in ensuring that programs are given data in a representation that they can handle.

The subType flag is presently used by mini-MTZ classes and CPdbDataFile to indicate the scientific content of the data.
CDataFile subType allowed values

| Flag | Value | Description |
|---|---|---|
| CPdbDataFile.SUBTYPE_MODEL | 1 | working model |
| CPdbDataFile.SUBTYPE_FRAGMENT | 3 | structure fragment (e.g. ligand) |
| CPdbDataFile.SUBTYPE_HEAVY_ATOMS | 4 | heavy atoms |
| CObsDataFile.SUBTYPE_OBSERVED | 1 | observed data |
| CObsDataFile.SUBTYPE_DERIVED | 2 | derived data |
| CObsDataFile.SUBTYPE_REFERENCE | 3 | reference data |
| CPhsDataFile.SUBTYPE_UNBIASED | 1 | unbiased phases |
| CPhsDataFile.SUBTYPE_BIASED | 2 | biased phases |
| CMapCoeffsDataFile.SUBTYPE_NORMAL | 1 | normal map |
| CMapCoeffsDataFile.SUBTYPE_DIFFERENCE | 2 | difference map |
| CCootHistoryDataFile.SUBTYPE_INITIAL | 1 | Initialisation for Coot |
| CCootHistoryDataFile.SUBTYPE_HISTORY | 2 | Coot history file |

This information can be used to ensure that the data is used appropriately. Obviously the subType can be tricky to define so its use in the code is intended to be flexible.

If these flags are set in a CDataFile object that is being saved to the database then the contentFlag and/or subType are saved to the database. It is therefore very helpful if CPluginScript.processOutputFiles() implementations set these flags for output files. It is also possible to set a default value for contentFlag or subType in a wrapper def file if the nature of the program output file is known in advance, but a value set there is probably less obvious and may not be maintained properly.

It is possible to select input files based on the contentFlag or subType but note the handling of these two sorts of parameters is different and potentially could be different again for different data file classes. The input file selection is specified by the qualifiers: requiredContentFlag and requiredSubType both of which are expected to be a list of integers.

Currently the requiredContentFlag is only appropriate for mini-MTZ classes. It should be set if a program can only handle a limited set of the possible mini-MTZ representations. The GUI will then only allow selection of files that contain an appropriate representation or that contain data that can be converted to an appropriate representation.
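A selection test of this kind might look like the sketch below. The flag values are those listed for CObsDataFile above, but the conversion table (which representations can be derived from which) is an assumption made for the example, and is_selectable is a hypothetical helper rather than a CCP4i2 function.

```python
# Flag values as listed for CObsDataFile in this document.
CONTENT_FLAG_IPAIR = 1
CONTENT_FLAG_FPAIR = 2
CONTENT_FLAG_IMEAN = 3
CONTENT_FLAG_FMEAN = 4

# Assumed conversion table: representation -> representations it can be converted to.
CONVERTIBLE_TO = {
    CONTENT_FLAG_IPAIR: {CONTENT_FLAG_FPAIR, CONTENT_FLAG_IMEAN, CONTENT_FLAG_FMEAN},
    CONTENT_FLAG_FPAIR: {CONTENT_FLAG_FMEAN},
    CONTENT_FLAG_IMEAN: {CONTENT_FLAG_FMEAN},
    CONTENT_FLAG_FMEAN: set(),
}


def is_selectable(contentFlag, requiredContentFlag):
    """True if a file with contentFlag satisfies the list requiredContentFlag,
    either directly or after a conversion allowed by CONVERTIBLE_TO."""
    if contentFlag in requiredContentFlag:
        return True
    reachable = CONVERTIBLE_TO.get(contentFlag, set())
    return any(target in reachable for target in requiredContentFlag)


# A program that can only handle averaged structure factors.
required = [CONTENT_FLAG_FMEAN]
ipair_ok = is_selectable(CONTENT_FLAG_IPAIR, required)
# A program that needs Friedel pairs of intensities cannot use averaged data.
fmean_ok = is_selectable(CONTENT_FLAG_FMEAN, [CONTENT_FLAG_IPAIR])
```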

The definition of subType is less clear-cut than that of contentFlag, especially for PDB files. When specifying a list for requiredSubType, consider giving CPdbDataFile.SUBTYPE_UNKNOWN (i.e. 0) as the last value in the list. This serves as a 'wildcard' which allows any PDB file to be selected, while the GUI still sets the default and the first choices on a drop-down list to be of the preferred subType.
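The effect of the wildcard on a drop-down list can be sketched as follows. The subType values are taken from the table above; order_candidates and the (name, subType) pairs are hypothetical stand-ins for whatever the GUI actually uses, so this shows the intended behaviour rather than the real selection code.

```python
SUBTYPE_UNKNOWN = 0      # acts as a wildcard when present in requiredSubType
SUBTYPE_MODEL = 1
SUBTYPE_FRAGMENT = 3
SUBTYPE_HEAVY_ATOMS = 4


def order_candidates(files, requiredSubType):
    """Return the selectable files with preferred sub-types first.

    files is a list of (name, subType) pairs standing in for data file objects.
    """
    wildcard = SUBTYPE_UNKNOWN in requiredSubType
    preferred = [s for s in requiredSubType if s != SUBTYPE_UNKNOWN]

    def rank(item):
        subtype = item[1]
        # Preferred sub-types sort by their position in the list; others come last.
        return preferred.index(subtype) if subtype in preferred else len(preferred)

    allowed = [f for f in files if wildcard or f[1] in preferred]
    return sorted(allowed, key=rank)  # stable sort keeps the input order within a rank


files = [
    ("ligand.pdb", SUBTYPE_FRAGMENT),
    ("model.pdb", SUBTYPE_MODEL),
    ("sites.pdb", SUBTYPE_HEAVY_ATOMS),
]
with_wildcard = order_candidates(files, [SUBTYPE_MODEL, SUBTYPE_UNKNOWN])
without_wildcard = order_candidates(files, [SUBTYPE_MODEL])
```

With the wildcard every file stays selectable and the model is simply offered first; without it only the model is offered at all.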

In conclusion: in CPluginScript.processOutputFiles(), set contentFlag and subType for the output file data classes that support these, and in the def file set the qualifiers requiredContentFlag and requiredSubType for input file data classes.

Last modified: Thu Apr 23 10:21:59 BST 2015