CCP4i2 Data Classes

Introduction
Containers

Current Data Classes

CCP4Annotation

CAnnotation Annotation text with user id and time
CAnnotationList A list of annotations
CAuthor Placeholder for bibliographic author
CBibReference Bibliographic reference
CBibReferenceGroup Set of bibliographic references for a task
CFont Simplified Qt font options
CHostName Computer name
CHostname
CMetaDataTag This class will extend the list of enumerators if a new value for the string is entered
CMetaDataTagList
CServerGroup One or more compute servers used in "remote" running
CTime The time. Uses Python time module
CUserAddress User id and platform node
CUserId A user ID

CCP4CootData

CCootHistoryDataFile

CCP4Data

CBoolean A Boolean
CFloat A float
CFloatRange Two floats defining start and end of range
CFollowFromJob
CI2DataType
CInt An integer
CIntRange Two integers defining start and end of range
CJobStatus
CJobTitle
COneWord A single word string - no white space
CPatchSelection
CRange Base class for CIntRange and CFloatRange
CRangeSelection
CString A string
CTable
CUUID

CCP4File

CDataFile A data file - expected to have associated class for file contents
CDataFileContent Base class for classes holding file contents
CExePath
CExePathList
CExportedFile
CExportedFileList
CFileFunction List of recognised XML file functions
CFilePath A file path
CI2XmlDataFile A reference to an XML file with CCP4i2 Header
CI2XmlHeader Container for header info from XML file
CMmcifData Generic mmCIF data. This is intended to be a base class for other classes specific to coordinates, reflections or geometry data.
CMmcifDataFile A generic mmCIF format file. This is intended to be a base class for other classes specific to coordinates, reflections or geometry data.
CPostscriptDataFile A PostScript format file
CProjectId The CCP4i2 database project id - a global unique id
CProjectName The name of a CCP4i project or directory alias
CSceneDataFile An XML format file for defining a scene in CCP4mg.
CSearchPath
CSearchPathList
CTextDataFile A text data file
CVersion A (string) version number of the form n.m.i
CXmgrDataFile An xmgr format file. This is the input format for xmgrace, as output by scala or aimless
CXmlDataFile A reference to an XML file

CCP4MathsData

CAngle An angle
CEulerRotation
CMatrix33
CTransformation
CXyz
CXyzBox

CCP4ModelData

CAtomSelection
CBlastData
CBlastDataFile
CBlastItem
CChemComp
CContainsSeMet
CDictData
CDictDataFile A refmac dictionary file
CElement Chemical element
CEnsemble An ensemble of models. Typically, this would be a set of related PDB files, but models could also be xtal or EM maps. This should be indicated by the types entry. A single ensemble is a CList of structures.
CEnsembleList
CEnsemblePdbDataFile A PDB coordinate file containing ensemble of structures as 'NMR' models
CHhpredData
CHhpredDataFile
CHhpredItem
CHomolog
CHomologList
CMDLMolDataFile A molecule definition file (MDL)
CMonomer A monomer compound. ?smiles
CPdbData Contents of a PDB file - a subset with functionality for GUI
CPdbDataFile A PDB coordinate file
CPdbDataFileList
CPdbEnsembleItem
CRefmacNcs Definition of an NCS for Refmac - not implemented yet
CRefmacRigidDomain Definition of a rigid domain for Refmac
CResidueRange A residue range selection
CResidueRangeList A list of residue range selections
CSeqAlignDataFile A (multiple) sequence alignment file
CSeqDataFile A sequence file
CSeqDataFileList
CSequence A string of sequence one-letter codes. Need to be able to parse common seq file formats. Do we need to support alternative residues? What about nucleic/polysach?
CSequenceAlignment An alignment of two or more sequences. Each sequence is obviously related to class CSequence, but will also contain gaps relevant to the alignment. We could implement the contents as a list of CSequence objects? The alignment is typically formatted in a file as consecutive or interleaved sequences.
CSequenceMeta
CTLSDataFile A refmac TLS file

CCP4PerformanceData

CAtomCountPerformance
CDataReductionPerformance
CExpPhasPerformance
CModelBuildPerformance
CPerformanceIndicator
CPhaseErrorPerformance
CRefinementPerformance
CSuperposePerformance
CTestObsConversionsPerformance

CCP4RefmacData

CRefmacAnomalousAtom
CRefmacRestraintsDataFile
CRefmacRigidGroupItem
CRefmacRigidGroupList
CRefmacRigidGroupSegment

CCP4XtalData

CAltSpaceGroup
CAltSpaceGroupList
CAnomalousColumnGroup Selection of F/I and AnomF/I columns from MTZ. Expected to be part of ab initio phasing dataset (CDataset)
CAnomalousIntensityColumnGroup Selection of I and AnomI columns from MTZ. Expected to be part of ab initio phasing dataset (CDataset)
CAnomalousScatteringElement Definition of an anomalous scattering element
CAsuComponent A component of the asymmetric unit. This is for use in MR, defining what we are searching for. There are similarities to CCrystalComponents and it should maybe be merged.
CAsuComponentList
CAtomicFormFactors Table of form factors for element v wavelength
CCell A unit cell
CCellAngle A cell angle
CCellLength A cell length
CColumnGroup Groups of columns in MTZ - probably from analysis by hklfile
CColumnGroupItem Definition of set of columns that form a 'group'
CColumnGroupList
CColumnType A list of recognised MTZ column types
CColumnTypeList A list of acceptable MTZ column types
CCrystalComponents A list of sequences, monomers and anomalous scatterers expected in a crystal
CCrystalCompositionLabel Serves as column header for CCrystalComposition - is a name for a composition model. The composition model can be for one of three units.
CCrystalCompositionTable A table of crystal components v. composition models
CCrystalName
CDataset The experimental data model for ab initio phasing
CDatasetList
CDatasetName
CExperimentalDataType Experimental data type e.g. native or peak
CFPairColumnGroup
CFSigFColumnGroup
CFormFactor The form factor (Fp and Fpp) for a given element and wavelength
CFreeRColumnGroup
CFreeRDataFile
CGenericReflDataFile
CHLColumnGroup
CIPairColumnGroup
CISigIColumnGroup
CImageFile
CImosflmXmlDataFile An iMosflm data file
CImportUnmerged
CImportUnmergedList
CMapCoeffsDataFile
CMapColumnGroup
CMapDataFile A CCP4 Map file
CMergeMiniMtz
CMergeMiniMtzList
CMiniMtzDataFile
CMiniMtzDataFileList
CMmcifReflData Reflection data in mmCIF format
CMmcifReflDataFile A reflection file in mmCIF format
CMtzColumn An MTZ column with column label and column type
CMtzColumnGroup
CMtzColumnGroupType
CMtzData Some of the data contents of an MTZ file
CMtzDataFile An MTZ experimental data file
CMtzDataset
CObsDataFile
CPhaserSolDataFile
CPhasingGroup
CPhiFomColumnGroup
CPhsDataFile
CProgramColumnGroup A group of MTZ columns required for program input
CProgramColumnGroup0
CReindexOperator
CResolutionRange
CRunBatchRange
CRunBatchRangeList
CShelxFADataFile
CShelxLabel
CSpaceGroup A string holding the space group
CSpaceGroupCell Cell space group and parameters
CUnmergedDataContent
CUnmergedDataFile Handle MTZ, XDS and scalepack files. Allow wildcard filename
CUnmergedDataFileList
CUnmergedMtzDataFile
CWavelength Wavelength in Angstrom

Introduction

These Python data classes are used by CCP4i2 and by compatible pipelines and wrappers. The objective is that these classes cover all the crystallographic data and that CCP4i2 provides widgets suitable for each data class. The classes are in modules named CCP4Whatever and the classes are named CWhatever (the C is for CCP4, as distinct from Q for Qt). The key functionality of each class is:

It is intended that the classes should also provide useful scientific functionality.

All data classes are ultimately derived from CCP4Data.CData. There are 'simple' classes (CBoolean, CFloat, CInt, CString etc.) which are derived from CCP4Data.CBaseData. These can be sub-classed, but the use of qualifiers is intended to minimise the need to sub-class. For example, CInt has the optional qualifiers min and max, which define the limits of allowed values. These limits are then used in the CInt.validate() method. For example, the qualifiers for CCellLength are set in Python code:

  class CCellLength(CFloat):
    QUALIFIERS = { 'min' : 0.0,
                   'toolTip' : 'Cell length in Angstrom' }

In this example the toolTip shows the other important use of qualifiers: to provide helpful information for the GUI.
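The way qualifiers can drive validation may be sketched as follows. This is a minimal illustration only and not the CCP4i2 implementation: SketchFloat and SketchCellLength are hypothetical stand-ins for CFloat and CCellLength, and the qualifiers() helper and the list-of-strings return value of validate() are assumptions made for the example.

```python
class SketchFloat:
    """Hypothetical stand-in for CFloat; not the real CCP4Data class."""

    QUALIFIERS = {'min': None, 'max': None, 'toolTip': None}

    def __init__(self, value=None):
        self.value = value

    @classmethod
    def qualifiers(cls):
        """Merge QUALIFIERS from base classes down to this class."""
        merged = {}
        for klass in reversed(cls.__mro__):
            merged.update(getattr(klass, 'QUALIFIERS', {}))
        return merged

    def validate(self):
        """Return a list of error messages; an empty list means valid."""
        q = self.qualifiers()
        errors = []
        if self.value is None:
            return errors
        if q['min'] is not None and self.value < q['min']:
            errors.append(f"value {self.value} is below min {q['min']}")
        if q['max'] is not None and self.value > q['max']:
            errors.append(f"value {self.value} is above max {q['max']}")
        return errors


class SketchCellLength(SketchFloat):
    # Only the qualifiers that differ from the base class are given
    QUALIFIERS = {'min': 0.0, 'toolTip': 'Cell length in Angstrom'}


good = SketchCellLength(78.1)
bad = SketchCellLength(-1.0)
print(good.validate())
print(bad.validate())
```

The sub-class adds no methods of its own: the limit checking is inherited and only the qualifier values change, which is the reason qualifiers reduce the need to sub-class.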

'Complex' classes contain other classes - the contents are defined by a CONTENTS statement in the class:

  class CCell(CData):
    CONTENTS = { 'a' :     { 'class' : CCellLength },
                 'b' :     { 'class' : CCellLength },
                 'c' :     { 'class' : CCellLength },
                 'alpha' : { 'class' : CCellAngle },
                 'beta'  : { 'class' : CCellAngle },
                 'gamma' : { 'class' : CCellAngle } }

Nothing else would be required to define a functional CCell class. The CData.build() method will build the required data structure for the class from this definition.
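How a build step might turn a CONTENTS definition into a data structure can be sketched as below. This is an assumption-laden illustration, not the actual CData.build() code: all the Sketch* classes are hypothetical, and the sketch simply creates one attribute per CONTENTS entry.

```python
class SketchData:
    """Hypothetical stand-in for CData; not the real CCP4Data class."""

    CONTENTS = {}

    def __init__(self):
        self.build()

    def build(self):
        # Create one child object per CONTENTS entry and attach it
        # as an attribute named after the CONTENTS key.
        for name, definition in self.CONTENTS.items():
            setattr(self, name, definition['class']())


class SketchValue(SketchData):
    """A simple class holding a single value."""

    def __init__(self, value=None):
        super().__init__()
        self.value = value


class SketchCellLength(SketchValue):
    pass


class SketchCellAngle(SketchValue):
    pass


class SketchCell(SketchData):
    # The definition alone is enough; no methods are needed here
    CONTENTS = {'a':     {'class': SketchCellLength},
                'b':     {'class': SketchCellLength},
                'c':     {'class': SketchCellLength},
                'alpha': {'class': SketchCellAngle},
                'beta':  {'class': SketchCellAngle},
                'gamma': {'class': SketchCellAngle}}


cell = SketchCell()
cell.a.value = 78.1
cell.alpha.value = 90.0
print(type(cell.a).__name__, cell.a.value)
```

Each entry gets its own object, so setting cell.a does not affect cell.b even though both are declared with the same class.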

There is also support for lists and tables as CList and CTable. Unlike Python lists, these must have all elements of the same data type, which can be any CData class other than CList or CTable. Both are derived from CCollection, which handles a subItem: the definition of the type of the elements of the CList or CTable.
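The idea of a collection whose element type is fixed by a subItem definition can be sketched as follows. The classes are hypothetical stand-ins rather than the real CCollection and CList, and the SUBITEM attribute name and the TypeError on a mismatch are assumptions made for this example.

```python
class SketchString:
    """Hypothetical stand-in for a simple data class such as CString."""

    def __init__(self, value=''):
        self.value = value


class SketchList:
    """Hypothetical stand-in for CList: all elements share one type."""

    # Definition of the element type, in the style of CONTENTS entries
    SUBITEM = {'class': object}

    def __init__(self):
        self._items = []

    def append(self, item):
        expected = self.SUBITEM['class']
        if not isinstance(item, expected):
            raise TypeError(
                f'expected {expected.__name__}, got {type(item).__name__}')
        self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


class SketchStringList(SketchList):
    SUBITEM = {'class': SketchString}


names = SketchStringList()
names.append(SketchString('native'))
try:
    names.append(42)          # wrong element type is rejected
except TypeError as err:
    print('rejected:', err)
print(len(names))
```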

Containers

CCP4Container.CContainer is a sub-class of CCP4Data.CData that can hold a set of CCP4Data.CData objects that are defined at run time; the definition usually comes from an XML def file. Typically a CContainer will hold all of the data for one pipeline, wrapper or GUI. CContainers can contain other CContainers and it is recommended that the container for a pipeline or wrapper has three sub-containers called 'inputData', 'outputData' and 'parameters' to distinguish the different functions of the data. The content of the def file can be defined most easily using the defEd program.
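The recommended layout of three sub-containers can be sketched as below. This is only an illustration: SketchContainer is a hypothetical stand-in for CContainer, the contents are added directly in Python instead of being read from an XML def file, plain Python values stand in for CData objects, and the item names XYZIN and NCYCLES are invented for the example.

```python
class SketchContainer:
    """Hypothetical stand-in for CContainer; contents set at run time."""

    def __init__(self, name):
        self.name = name
        self._contents = {}

    def add(self, name, obj):
        """Add a named object (or sub-container) and return it."""
        self._contents[name] = obj
        return obj

    def names(self):
        return list(self._contents)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        contents = self.__dict__.get('_contents', {})
        if name in contents:
            return contents[name]
        raise AttributeError(name)


def make_task_container(task_name):
    """Build a container with the three recommended sub-containers."""
    top = SketchContainer(task_name)
    for sub in ('inputData', 'parameters', 'outputData'):
        top.add(sub, SketchContainer(sub))
    return top


task = make_task_container('my_wrapper')
task.inputData.add('XYZIN', 'model.pdb')    # hypothetical input item
task.parameters.add('NCYCLES', 10)          # hypothetical parameter
print(task.names())
print(task.parameters.NCYCLES)
```

Keeping input, control parameters and output in separate sub-containers makes the role of each data item clear from its path, for example task.inputData.XYZIN.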


Class list last updated: 14:53 18/Aug/16