[feature] Setup API auto doc (#429)
*Issue #, if available:*

*Description of changes:*
This PR sets up the new documentation generation mechanism, creates new
API doc rst files, and modifies existing Python code for doc files.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Theodore Vasiloudis <[email protected]>
3 people authored Sep 12, 2023
1 parent c63fb1f commit 1c06aac
Showing 19 changed files with 342 additions and 24 deletions.
30 changes: 30 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,30 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.9"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/source/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
# - pdf
# - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - method: pip
      path: .
    - requirements: docs/requirements.txt
6 changes: 6 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,6 @@
sphinx==7.1.2
sphinx-rtd-theme==1.3.0
--extra-index-url https://download.pytorch.org/whl/cpu
torch==1.13.1+cpu
-f https://data.dgl.ai/wheels-internal/repo.html
dgl==1.0.4
13 changes: 13 additions & 0 deletions docs/source/_templates/classtemplate.rst
@@ -0,0 +1,13 @@
.. role:: hidden
    :class: hidden-section
.. currentmodule:: {{ module }}


{{ name | underline}}

.. autoclass:: {{ name }}
    :show-inheritance:
    :members: prepare_data, get_node_feats, get_edge_feats, get_labels, forward, get_sparse_params,
        get_general_dense_parameters, get_lm_dense_parameters, save_model, remove_saved_model,
        save_topk_models, get_best_model_path, restore_model, fit, eval, infer, evaluate,
        do_eval, compute_score, predict
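
The ``{{ name | underline}}`` line in the template above turns the class name into an RST section title by adding an underline of matching length. As a rough illustration only, the effect can be approximated in plain Python as follows. The ``underline`` function below is a stand-in written for this note (it is not Sphinx's own code), and the ``=`` underline character is an assumption about the default.

```python
def underline(title: str, line: str = "=") -> str:
    """Return an RST-style title: the text followed by an underline.

    Illustrative stand-in for the ``underline`` template filter; the
    default underline character is an assumption.
    """
    if "\n" in title:
        # An RST section title has to fit on a single line.
        raise ValueError("can only underline single-line titles")
    # The underline is exactly as long as the title text.
    return title + "\n" + line * len(title)


# Example: render the heading for one of the documented classes.
print(underline("GSgnnTrainer"))
```

Because the underline length is computed from the name, each generated class page gets a well-formed title regardless of how long the class name is.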
62 changes: 62 additions & 0 deletions docs/source/api/graphstorm.customized.rst
@@ -0,0 +1,62 @@
.. _apicustomized:

customized model APIs
==========================

GraphStorm provides a set of APIs for users to integrate their own customized models with
the framework of GraphStorm, so that users' own models can leverage GraphStorm's easy-to-use
and distributed capabilities.

For how to modify users' own models, please refer to this :ref:`Use Your Own Model Tutorial
<use-own-models>`.

In general, there are three sets of APIs involved in programming customized models.

* Dataloaders: users need to extend GraphStorm's abstract node or edge dataloader to implement
  their own graph samplers or mini-batch generators.
* Models: depending on the specific GML task, users need to extend the corresponding ModelBase and
  ModelInterface, and then implement the required abstract functions.
* Evaluators: if necessary, users can also extend the two evaluator templates to implement their
  own performance evaluation methods.

.. currentmodule:: graphstorm

Dataloaders
------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

.. dataloading.AbsNodeDataLoader
.. dataloading.AbsEdgeDataLoader
Models
------------

.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    model.GSgnnModelBase
    model.GSgnnNodeModelBase
    model.GSgnnEdgeModelBase
    model.GSgnnLinkPredictionModelBase
    model.GSgnnNodeModelInterface
    model.GSgnnEdgeModelInterface
    model.GSgnnLinkPredictionModelInterface

Evaluators
------------

If users want to implement customized evaluators or evaluation methods, a best practice is to
extend the ``eval.GSgnnInstanceEvaluator`` class, and implement the abstract methods.

.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    eval.GSgnnInstanceEvaluator
    eval.GSgnnLPEvaluator
32 changes: 32 additions & 0 deletions docs/source/api/graphstorm.dataloading.rst
@@ -0,0 +1,32 @@
.. _apidataloading:

graphstorm.dataloading
==========================

GraphStorm dataloading module includes a set of graph datasets and dataloaders for different
graph machine learning tasks.

.. currentmodule:: graphstorm.dataloading

DataSets
------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSgnnNodeTrainData
    GSgnnNodeInferData
    GSgnnEdgeTrainData
    GSgnnEdgeInferData

Dataloaders
------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSgnnNodeDataLoader
    GSgnnEdgeDataLoader
    GSgnnLinkPredictionDataLoader
20 changes: 20 additions & 0 deletions docs/source/api/graphstorm.evaluator.rst
@@ -0,0 +1,20 @@
.. _apievaluator:

graphstorm.evaluator
=======================

GraphStorm evaluators provide built-in evaluation methods for different Graph Machine
Learning (GML) tasks.

.. currentmodule:: graphstorm.eval
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSgnnLPEvaluator
    GSgnnMrrLPEvaluator
    GSgnnPerEtypeMrrLPEvaluator
    GSgnnAccEvaluator
    GSgnnRegressionEvaluator

20 changes: 20 additions & 0 deletions docs/source/api/graphstorm.inferer.rst
@@ -0,0 +1,20 @@
.. _apiinferer:

graphstorm.inferer
====================

GraphStorm inferers assemble the distributed inference pipeline for different tasks.

If possible, users should always use these inferers to avoid handling the distributed
processing and tasks.

.. currentmodule:: graphstorm.inference

.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSgnnLinkPredictionInfer
    GSgnnNodePredictionInfer
    GSgnnEdgePredictionInfer
41 changes: 41 additions & 0 deletions docs/source/api/graphstorm.model.rst
@@ -0,0 +1,41 @@
.. _apimodel:

graphstorm.model
=================

A GraphStorm model normally contains three components:

* Input layer: a set of modules to convert input data for different use cases,
  e.g., embedding textual features.
* Encoder: a set of Graph Neural Network modules.
* Decoder: a set of modules to convert results from encoders for different tasks,
  e.g., classification, regression, or link prediction.

.. currentmodule:: graphstorm.model

Model input layers
-------------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSNodeEncoderInputLayer
    GSLMNodeEncoderInputLayer
    GSPureLMNodeInputLayer

Model encoders and layers
--------------------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    RelationalGCNEncoder
    RelGraphConvLayer
    RelationalGATEncoder
    RelationalAttLayer
    SAGEEncoder
    SAGEConv
    HGTEncoder
    HGTLayer
21 changes: 21 additions & 0 deletions docs/source/api/graphstorm.rst
@@ -0,0 +1,21 @@
.. _apigraphstorm:

.. currentmodule:: graphstorm

graphstorm
============

The ``graphstorm`` package contains a set of functions for environment setup.
Users can call these functions directly, as in the following code.

>>> import graphstorm as gs
>>> gs.initialize()
>>> gs.get_rank()

.. autosummary::
    :toctree: ../generated/

    gsf.initialize
    gsf.get_feat_size
    utils.get_rank
    utils.get_world_size
42 changes: 42 additions & 0 deletions docs/source/api/graphstorm.trainer.rst
@@ -0,0 +1,42 @@
.. _apitrainer:

graphstorm.trainer
=====================

GraphStorm trainers assemble the distributed training pipeline for different tasks or
different training methods.

If possible, users should always use these trainers to avoid handling the distributed
processing and tasks.

.. currentmodule:: graphstorm.trainer


Base class
--------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSgnnTrainer

Task classes
-----------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GSgnnLinkPredictionTrainer
    GSgnnNodePredictionTrainer
    GSgnnEdgePredictionTrainer

Method classes
-----------------
.. autosummary::
    :toctree: ../generated/
    :nosignatures:
    :template: classtemplate.rst

    GLEMNodePredictionTrainer
29 changes: 24 additions & 5 deletions docs/source/conf.py
@@ -3,26 +3,45 @@
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys

sys.path.insert(0, os.path.abspath("../../python"))

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

import graphstorm

project = 'GraphStorm'
copyright = '2023, AGML team'
author = 'AGML team'
release = '0.1.2'
version = graphstorm.__version__
release = graphstorm.__version__

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = []

extensions = [
    "sphinx.ext.duration",
    "sphinx.ext.doctest",
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "sphinx.ext.coverage",
    "sphinx.ext.mathjax",
]
templates_path = ['_templates']
exclude_patterns = []



# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'furo'
html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']
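
In the ``conf.py`` change above, both ``version`` and ``release`` are now read from ``graphstorm.__version__`` instead of being hard-coded. Sphinx treats ``version`` as the short major version and ``release`` as the full version string, so a project whose release string carries more components may want to derive one from the other. The snippet below is a minimal sketch of that idea, not part of this commit; ``short_version`` is a hypothetical helper and the literal version strings are stand-ins for ``graphstorm.__version__``.

```python
def short_version(release: str) -> str:
    """Return the leading ``major.minor`` part of a dotted version string.

    Hypothetical helper for illustration; it is not part of GraphStorm
    or Sphinx.
    """
    parts = release.split(".")
    # Keep at most the first two dotted components.
    return ".".join(parts[:2])


# Stand-in for ``graphstorm.__version__`` (assumed value for this sketch).
release = "0.2"
version = short_version(release)
print(version, release)
```

With a two-component version such as the ``"0.2"`` set in ``python/graphstorm/__init__.py``, both values come out identical, which matches what the commit does directly.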
3 changes: 1 addition & 2 deletions docs/source/configuration/configuration-run.rst
@@ -118,8 +118,7 @@ GraphStorm provides a set of parameters to control how and where to save and res
- Yaml: ``restore_model_path: /model/checkpoint/``
- Argument: ``--restore-model-path /model/checkpoint/``
- Default value: This parameter must be provided if users want to restore a saved model.
- **restore_model_layers**: Specify which GraphStorm neural network layers to load. This argument is useful when a user wants to pre-train a GraphStorm model using link prediction and fine-tune the same model on a node or edge classification/regression task.
Currently, three neural network layers are supported, i.e., ``embed`` (input layer), ``gnn`` and ``decoder``. A user can select one or more layers to load.
- **restore_model_layers**: Specify which GraphStorm neural network layers to load. This argument is useful when a user wants to pre-train a GraphStorm model using link prediction and fine-tune the same model on a node or edge classification/regression task. Currently, three neural network layers are supported, i.e., ``embed`` (input layer), ``gnn`` and ``decoder``. A user can select one or more layers to load.
- Yaml: ``restore_model_layers: embed``
- Argument: ``--restore-model-layers embed,gnn``
- Default value: Load all neural network layers
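
To make the ``restore_model_layers`` description concrete, the sketch below shows one way a comma-separated value such as ``embed,gnn`` could be parsed and checked against the three supported layer names. It is an illustration under stated assumptions, not GraphStorm's actual argument handling: ``parse_restore_model_layers`` and ``SUPPORTED_LAYERS`` are hypothetical names introduced here.

```python
# Layer names taken from the parameter description above.
SUPPORTED_LAYERS = ("embed", "gnn", "decoder")


def parse_restore_model_layers(value):
    """Parse a comma-separated layer list such as ``"embed,gnn"``.

    Hypothetical helper for illustration only; GraphStorm's real
    parsing logic may differ.
    """
    if value is None or value.strip() == "":
        # Documented default: load all neural network layers.
        return list(SUPPORTED_LAYERS)
    layers = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in layers if name not in SUPPORTED_LAYERS]
    if unknown:
        raise ValueError(f"Unsupported layer name(s): {unknown}")
    return layers


# Example: restore only the input layer and the GNN encoder, e.g. when
# fine-tuning a model that was pre-trained with link prediction.
print(parse_restore_model_layers("embed,gnn"))
```

Skipping ``decoder`` in this way fits the pre-train and fine-tune scenario described above, where the downstream task needs a different decoder than the link prediction one.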
10 changes: 9 additions & 1 deletion docs/source/index.rst
@@ -34,10 +34,18 @@ Welcome to the GraphStorm Documentation and Tutorials

.. toctree::
    :maxdepth: 2
    :caption: API Reference:
    :caption: API Reference
    :hidden:
    :glob:

    api/graphstorm
    api/graphstorm.dataloading
    api/graphstorm.model
    api/graphstorm.trainer
    api/graphstorm.inferer
    api/graphstorm.evaluator
    api/graphstorm.customized

GraphStorm is a graph machine learning (GML) framework designed for enterprise use cases. It simplifies the development, training and deployment of GML models on industry-scale graphs (measured in billions of nodes and edges) by providing scalable training and inference pipelines of GML models. GraphStorm comes with a collection of built-in GML models, allowing users to train a GML model with a single command, eliminating the need to write any code. Moreover, GraphStorm provides a wide range of configurations to customize model implementations and training pipelines, enhancing model performance. In addition, GraphStorm offers a programming interface that enables users to train custom GML models in a distributed manner. Users can bring their own model implementations and leverage the GraphStorm training pipeline for scalability.

Getting Started
2 changes: 2 additions & 0 deletions python/graphstorm/__init__.py
@@ -17,6 +17,8 @@
"""
__version__ = "0.2"

from . import gsf
from . import utils
from .utils import get_rank, get_world_size
from .gsf import initialize, get_feat_size
from .gsf import create_builtin_node_gnn_model