Examples
********

This directory contains a curated set of examples for validating and exploring the main functionalities of **MHCXGraph**. Each example is driven by a JSON manifest file that defines the input structures, execution mode, selectors, and output path.

Directory structure
===================

.. code-block:: text

   examples/
   ├── EXAMPLES.rst
   ├── input
   │   ├── pre-renumbered
   │   └── renumbered
   ├── manifests
   │   ├── manifest-minimal.json
   │   ├── manifest-multiple.json
   │   ├── manifest-pairwise.json
   │   └── manifest-screening.json
   └── results

Overview
========

The core workflow of **MHCXGraph** is:

.. code-block:: text

   run → (optional) heatmap

Two main usage paths are available, depending on whether the analysis should be restricted to functionally relevant hotspot residues:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Path
     - When to use
   * - **A**: ``run`` → ``heatmap``
     - General structural comparison without residue filtering.
   * - **B**: ``renumber`` → ``run`` (with selectors) → ``heatmap``
     - Analysis restricted to MHC hotspot residues. This requires IMGT-standardized numbering.

The ``renumber`` and ``heatmap`` commands are accessory modules. The ``renumber`` step is only required when using the predefined ``MHC1`` or ``MHC2`` selectors, as described in :ref:`hotspot_selectors_section`.

Core processing
===============

The ``run`` command is the main execution module of **MHCXGraph**. It is driven by a manifest file that defines the inputs, execution mode, and structural selectors.

For a detailed description of the manifest structure and available configuration fields, see :ref:`manifest_section`.

.. warning::

   Input PDB files should not contain additional macromolecules, such as TCRs or antibodies, interacting with the MHC in a way that alters the exposure of surface residues. Their presence may affect relative solvent accessibility (RSA) calculations and graph connectivity, leading to inaccurate comparison results.
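
Before running, it can help to confirm that an input file carries only the chains you expect. The following is a minimal sketch and is not part of **MHCXGraph**: the helper names ``chains_in_pdb`` and ``unexpected_chains`` are hypothetical, and the expected chain set is an assumption you would adapt to your own files. It relies only on the fixed-column PDB format, in which the chain identifier occupies column 22 of each ``ATOM`` record.

.. code-block:: python

   def chains_in_pdb(pdb_text: str) -> set[str]:
       """Return the chain identifiers found in ATOM records of PDB-format text."""
       chains = set()
       for line in pdb_text.splitlines():
           # Fixed-column PDB format: the chain identifier is column 22 (index 21).
           if line.startswith("ATOM") and len(line) > 21:
               chains.add(line[21])
       return chains


   def unexpected_chains(pdb_text: str, expected: set[str]) -> set[str]:
       """Return chains present in the structure but absent from ``expected``.

       ``expected`` is an assumption supplied by the caller, for example
       {"A", "B", "C"} for a heavy chain, a second MHC chain, and a peptide.
       """
       return chains_in_pdb(pdb_text) - expected

A non-empty result from ``unexpected_chains`` only signals that the file deserves a closer look. It does not establish whether the extra chain actually contacts the MHC surface.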

Pairwise mode
-------------

The ``pairwise`` mode performs an all-against-all comparison across the input dataset. Its main purpose is to quantify structural similarity between every possible pair of MHC structures.

.. code-block:: bash

   MHCXGraph run manifests/manifest-pairwise.json

Multiple mode
-------------

The ``multiple`` mode performs a simultaneous comparison of a group of structures in a single run. It is useful for identifying structural features shared across all inputs.

.. code-block:: bash

   MHCXGraph run manifests/manifest-multiple.json

Screening mode
--------------

The ``screening`` mode is optimized for one-against-many comparisons. It compares a reference structure against a directory of target structures in order to identify structural cross-reactive candidates.

.. code-block:: bash

   MHCXGraph run manifests/manifest-screening.json

Visualization with heatmap
==========================

After a pairwise run, **MHCXGraph** can generate a similarity heatmap with hierarchical clustering. The similarity metric is a coverage-based index defined as the fraction of unique nodes in association graph frames relative to the total number of nodes in both input graphs.

.. code-block:: bash

   MHCXGraph heatmap \
     -i results/pairwise/PAIRWISE/ \
     -o results/pairwise/ \
     -n similarity_heatmap.png

.. list-table::
   :header-rows: 1
   :widths: 15 85

   * - Flag
     - Description
   * - ``-i``
     - Input directory, corresponding to the ``PAIRWISE/`` folder generated by a pairwise run.
   * - ``-o``
     - Output directory for the heatmap image.
   * - ``-n``
     - Output file name.

.. note::

   This module requires the output of a completed pairwise run. Execute ``MHCXGraph run`` with a pairwise manifest before using ``MHCXGraph heatmap``.

Output files
------------

In addition to the ``.png`` figure, the heatmap module generates several data files that are useful for downstream analysis:

#. ``distance_matrix.csv``
   Numerical representation of dissimilarity between every pair of proteins.

#. ``component_count_matrix.csv``
   Number of connected components shared between each pair of proteins.

#. ``unique_nodes_graph_*.csv``
   Detailed lists of residues involved in the shared nodes for each pairwise comparison.

.. _hotspot_selectors_section:

Using hotspot residue selectors
===============================

By default, **MHCXGraph** processes the full structure. If you want to restrict the analysis to functionally relevant residues, such as those lining the peptide-binding groove or participating in TCR contact, you can use the predefined ``MHC1`` or ``MHC2`` selectors in the manifest.

When using these selectors, you must first run ``renumber``. The residue indices in ``MHC1`` and ``MHC2`` follow the IMGT domain numbering scheme, so applying them to structures with non-standardized numbering may produce incorrect or incomplete results.

Step 1, standardize numbering
-----------------------------

.. code-block:: bash

   MHCXGraph renumber \
     -i examples/input/pre-renumbered/ \
     -o examples/input/renumbered/ \
     -c mhci \
     --suffix _renumbered

.. list-table::
   :header-rows: 1
   :widths: 15 85

   * - Flag
     - Description
   * - ``-i``
     - Input directory containing the original structure files.
   * - ``-o``
     - Output directory for the renumbered files.
   * - ``-c``
     - Chain convention. Use ``mhci`` for MHC class I and ``mhcii`` for MHC class II.
   * - ``--suffix``
     - Suffix appended to each output file name.

Step 2, configure selectors in the manifest
-------------------------------------------

Two predefined hotspot residue sets are available.

``MHC1``
~~~~~~~~

Residues from MHC class I involved in peptide presentation and TCR recognition. This selector covers positions along the α1 and α2 helices and the floor of the peptide-binding groove, using IMGT numbering for chain ``A``.

.. code-block:: json

   "MHC1": {
     "chains": ["C"],
     "residues": {
       "A": [18, 19, 42, 43, 44, 54, 55, 56, 58, 59, 61, 62, 63, 64, 65, 66,
             68, 69, 70, 71, 72, 73, 75, 76, 79, 80, 83, 84, 89, 108, 109,
             142, 143, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,
             156, 157, 158, 159, 161, 162, 163, 165, 166, 167, 169, 170, 171]
     }
   }

``MHC2``
~~~~~~~~

Residues from MHC class II located at the peptide-binding interface, distributed across the α chain (chain ``A``) and β chain (chain ``B``).

.. code-block:: json

   "MHC2": {
     "chains": ["C"],
     "residues": {
       "A": [37, 51, 52, 53, 55, 56, 58, 59, 60, 62, 63, 65, 66, 67, 69],
       "B": [56, 57, 59, 60, 61, 62, 63, 65, 66, 67, 68, 69, 70, 71, 72, 73,
             74, 77, 78, 81]
     }
   }

.. note::

   Use ``MHC1`` for class I structures and ``MHC2`` for class II structures. Do not mix these selectors across incompatible chain organizations.

Step 3, run using the renumbered input
--------------------------------------

After renumbering, point the manifest to the renumbered directory and proceed normally:

.. code-block:: bash

   MHCXGraph run manifests/manifest-pairwise.json
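
Because the selectors address residues by number, a quick sanity check on a renumbered file can catch gross mismatches before a run. The sketch below is illustrative only and is not part of **MHCXGraph**: the function names ``residues_in_pdb`` and ``missing_selector_residues`` are hypothetical, and the selector is assumed to be a Python dictionary shaped like the ``MHC1`` and ``MHC2`` fragments shown in Step 2. It reads the chain identifier (column 22) and residue sequence number (columns 23 to 26) from ``ATOM`` records.

.. code-block:: python

   def residues_in_pdb(pdb_text: str) -> dict[str, set[int]]:
       """Map each chain identifier to the residue numbers seen in ATOM records."""
       found: dict[str, set[int]] = {}
       for line in pdb_text.splitlines():
           if not line.startswith("ATOM") or len(line) < 26:
               continue
           chain = line[21]  # column 22: chain identifier
           try:
               number = int(line[22:26])  # columns 23-26: residue sequence number
           except ValueError:
               continue  # skip records whose residue number is not an integer
           found.setdefault(chain, set()).add(number)
       return found


   def missing_selector_residues(pdb_text: str, selector: dict) -> dict[str, list[int]]:
       """Return, per chain, the selector residue numbers absent from the structure.

       ``selector`` is assumed to have a "residues" mapping of chain -> numbers,
       as in the MHC1/MHC2 fragments; this layout is taken from those fragments
       and is not a documented API.
       """
       present = residues_in_pdb(pdb_text)
       missing: dict[str, list[int]] = {}
       for chain, numbers in selector["residues"].items():
           absent = sorted(set(numbers) - present.get(chain, set()))
           if absent:
               missing[chain] = absent
       return missing

An empty result means every selector position exists in the file. It does not prove that the numbering follows the IMGT scheme, so it complements the ``renumber`` step and does not replace it.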