
[DOC][v2.0] Part1: Link Check #20487

Merged · 8 commits · Aug 9, 2021
2 changes: 1 addition & 1 deletion docs/python_docs/python/Makefile
@@ -49,7 +49,7 @@ html: $(OBJ)
cp -n -r ../_static build/ || true;
sphinx-autogen build/api/*.rst build/api/**/*.rst -t build/_templates/;
# make -C build linkcheck doctest html
-make -C build html;
+make -C build linkcheck html;
sed -i.bak 's/33\,150\,243/23\,141\,201/g' build/_build/html/_static/material-design-lite-1.3.0/material.blue-deep_orange.min.css;
make update_github_link

2 changes: 1 addition & 1 deletion docs/python_docs/python/tutorials/deploy/export/onnx.md
@@ -29,7 +29,7 @@ In this tutorial, we will learn how to use MXNet to ONNX exporter on pre-trained

To run the tutorial, you will need the following Python modules installed:
- [MXNet >= 1.3.0](https://mxnet.apache.org/get_started)
-- [onnx](https://github.com/onnx/onnx#installation) v1.2.1 (follow the install guide)
+- [onnx](https://github.com/onnx/onnx#user-content-installation) v1.2.1 (follow the install guide)

*Note:* The MXNet-ONNX importer and exporter follow version 7 of the ONNX operator set, which comes with ONNX v1.2.1.

@@ -33,7 +33,7 @@ Gluon provides State of the Art models for many of the standard tasks such as Cl

To complete this tutorial, you need:

-- [Build MXNet from source](https://mxnet.apache.org/get_started/ubuntu_setup#build-mxnet-from-source) with Python (Gluon) and C++ packages
+- [Build MXNet from source](https://mxnet.apache.org/get_started/build_from_source) with Python (Gluon) and C++ packages
- Learn the basics about Gluon with [A 60-minute Gluon Crash Course](https://gluon-crash-course.mxnet.io/)


@@ -93,15 +93,15 @@ Output:

As a rule of thumb, one should always implement custom layers by inheriting from `HybridBlock`. This allows for more flexibility and doesn't affect execution speed once hybridization is done.

-Unfortunately, at the moment of writing this tutorial, NLP-related layers such as [RNN](https://mxnet.apache.org/api/python/gluon/rnn.html#mxnet.gluon.rnn.RNN), [GRU](https://mxnet.apache.org/api/python/gluon/rnn.html#mxnet.gluon.rnn.GRU) and [LSTM](https://mxnet.apache.org/api/python/gluon/rnn.html#mxnet.gluon.rnn.LSTM) inherit directly from the `Block` class via the common `_RNNLayer` class. That means that networks with such layers cannot be hybridized. But this might change in the future, so stay tuned.
+Unfortunately, at the moment of writing this tutorial, NLP-related layers such as [RNN](../../../../api/gluon/rnn/index.rst#mxnet.gluon.rnn.RNN), [GRU](../../../../api/gluon/rnn/index.rst#mxnet.gluon.rnn.GRU) and [LSTM](../../../../api/gluon/rnn/index.rst#mxnet.gluon.rnn.LSTM) inherit directly from the `Block` class via the common `_RNNLayer` class. That means that networks with such layers cannot be hybridized. But this might change in the future, so stay tuned.

Note that hybridization has nothing to do with computation on a GPU. One can train both hybridized and non-hybridized networks on both CPU and GPU, though hybridized networks run faster. How much faster, however, is hard to say in advance.
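
To make this concrete, here is a minimal sketch of a custom layer built on `HybridBlock` (the `ScaleShift` layer below is a hypothetical example, not part of this tutorial):

```python
import mxnet as mx
from mxnet.gluon import nn

class ScaleShift(nn.HybridBlock):
    """Hypothetical example layer computing y = x * scale + shift."""
    def __init__(self, scale=2.0, shift=1.0, **kwargs):
        super(ScaleShift, self).__init__(**kwargs)
        self.scale = scale
        self.shift = shift

    def hybrid_forward(self, F, x):
        # F is mxnet.ndarray in imperative mode and mxnet.symbol after hybridize()
        return x * self.scale + self.shift

layer = ScaleShift()
layer.hybridize()  # compiles the forward pass into a static graph
print(layer(mx.nd.array([1.0, 2.0, 3.0])))  # -> [3. 5. 7.]
```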

## Adding a custom layer to a network

While it is possible to use custom layers on their own, they are rarely used separately. Most often they are combined with predefined layers to create a neural network, with the output of one layer serving as the input of another.

-Depending on which class you used as a base, you can use either the [Sequential](https://mxnet.apache.org/api/python/gluon/gluon.html#mxnet.gluon.nn.Sequential) or the [HybridSequential](https://mxnet.apache.org/api/python/gluon/gluon.html#mxnet.gluon.nn.HybridSequential) container to form a sequential neural network. By adding layers one by one, one adds dependencies of one layer's input on another layer's output. It is worth noting that both `Sequential` and `HybridSequential` containers inherit from `Block` and `HybridBlock` respectively.
+Depending on which class you used as a base, you can use either the [Sequential](../../../../api/gluon/nn/index.rst#mxnet.gluon.nn.Sequential) or the [HybridSequential](../../../../api/gluon/nn/index.rst#mxnet.gluon.nn.HybridSequential) container to form a sequential neural network. By adding layers one by one, one adds dependencies of one layer's input on another layer's output. It is worth noting that both `Sequential` and `HybridSequential` containers inherit from `Block` and `HybridBlock` respectively.

Below is an example of how to create a simple neural network with a custom layer. In this example, `NormalizationHybridLayer` takes the output of the `Dense(5)` layer as input and passes its own output on to the `Dense(1)` layer.

@@ -133,7 +133,7 @@ Output:

## Parameters of a custom layer

-Usually, a layer has a set of associated parameters, sometimes also referred to as weights. This is the internal state of a layer. Most often, these are the parameters that we want to learn during the backpropagation step, but sometimes they might just be constants we want to use during the forward pass. The parameters are usually represented by the [Parameter](https://mxnet.apache.org/api/python/gluon/gluon.html#mxnet.gluon.Parameter) class inside an Apache MXNet neural network.
+Usually, a layer has a set of associated parameters, sometimes also referred to as weights. This is the internal state of a layer. Most often, these are the parameters that we want to learn during the backpropagation step, but sometimes they might just be constants we want to use during the forward pass. The parameters are usually represented by the [Parameter](../../../../api/gluon/parameter.rst#gluon-parameter) class inside an Apache MXNet neural network.


@@ -246,11 +246,6 @@ net.export("lenet", epoch=1)

## Loading model parameters AND architecture from file

-### From a different frontend
-
-One of the main reasons to serialize model architecture into a JSON file is to load it from a different frontend like C, C++ or Scala. Here is a couple of examples:
-1. [Loading serialized Hybrid networks from C](https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/predict-cpp/image-classification-predict.cc)
-2. [Loading serialized Hybrid networks from Scala](https://github.com/apache/incubator-mxnet/blob/master/scala-package/infer/src/main/scala/org/apache/mxnet/infer/ImageClassifier.scala)

### From Python
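
A minimal sketch of the Python route, assuming the `lenet-symbol.json` / `lenet-0001.params` files written by `net.export("lenet", epoch=1)` above, and a 28x28 single-channel input:

```python
import mxnet as mx
from mxnet import gluon

# Load the architecture (JSON) and parameters saved by net.export()
net = gluon.nn.SymbolBlock.imports("lenet-symbol.json", ["data"],
                                   "lenet-0001.params", ctx=mx.cpu())
out = net(mx.nd.random.uniform(shape=(1, 1, 28, 28)))
```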

@@ -236,7 +236,7 @@ The network would learn to minimize the distance between the two `A`'s and maxim

#### [CTC Loss](../../../../api/gluon/loss/index.rst#mxnet.gluon.loss.CTCLoss)

-CTC Loss is the [connectionist temporal classification loss](https://distill.pub/2017/ctc/). It is used to train recurrent neural networks with a variable time dimension. It learns the alignment and labelling of input sequences. It takes a sequence as input and gives probabilities for each timestep. For instance, in the following image the word is not well aligned with the 5 timesteps because of the different sizes of characters. CTC Loss finds the highest probability for each timestep, e.g. `t1` most likely represents a `C`. It combines the highest probabilities and returns the best path decoding. For an in-depth tutorial on how to use CTC-Loss in MXNet, check out this [example](https://github.com/apache/incubator-mxnet/tree/master/example/ctc).
+CTC Loss is the [connectionist temporal classification loss](https://distill.pub/2017/ctc/). It is used to train recurrent neural networks with a variable time dimension. It learns the alignment and labelling of input sequences. It takes a sequence as input and gives probabilities for each timestep. For instance, in the following image the word is not well aligned with the 5 timesteps because of the different sizes of characters. CTC Loss finds the highest probability for each timestep, e.g. `t1` most likely represents a `C`. It combines the highest probabilities and returns the best path decoding.

![ctc_loss](/_static/ctc_loss.png)
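
A minimal usage sketch of Gluon's `CTCLoss` (the shapes and label values here are illustrative, not from the tutorial):

```python
import mxnet as mx
from mxnet import gluon

ctc_loss = gluon.loss.CTCLoss()                 # default layouts: pred 'NTC', label 'NT'
pred = mx.nd.random.uniform(shape=(2, 20, 11))  # (batch, timesteps, alphabet size + blank)
label = mx.nd.array([[1, 2, 3, -1, -1],         # shorter labels are padded with -1
                     [4, 5, 6, 7, 8]])
print(ctc_loss(pred, label))                    # one loss value per batch element
```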

@@ -483,6 +483,6 @@ Summary
In this notebook, we have shown how to train a GNMT model on IWSLT 2015
English-Vietnamese using Gluon NLP toolkit. The complete training script
can be found
-`here <https://github.com/dmlc/gluon-nlp/blob/master/scripts/nmt/train_gnmt.py>`__.
+`here <https://github.com/dmlc/gluon-nlp/blob/v0.x/scripts/machine_translation/train_gnmt.py>`__.
The command to reproduce the result can be seen in the `nmt scripts
page <http://gluon-nlp.mxnet.io/scripts/index.html#machine-translation>`__.
@@ -21,7 +21,7 @@ In this tutorial, you will learn how to use the [Gluon Fit API](https://cwiki.ap

With the Fit API, you can train a deep learning model with a minimal amount of code. Just specify the network, the loss function and the data you want to train on. You don't need to worry about the boilerplate code that loops through the dataset in batches (often called the 'training loop'). Advanced users can still train with bespoke training loops, and many of these use cases will be covered by the Fit API.

-To demonstrate the Fit API, you will train an image classification model using the [ResNet-18](https://arxiv.org/abs/1512.03385) neural network architecture. The model will be trained using the [Fashion-MNIST dataset](https://research.zalando.com/welcome/mission/research-projects/fashion-mnist/).
+To demonstrate the Fit API, you will train an image classification model using the [ResNet-18](https://arxiv.org/abs/1512.03385) neural network architecture. The model will be trained using the [Fashion-MNIST dataset](https://github.com/zalandoresearch/fashion-mnist).
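
As a rough sketch of what this looks like (parameter names vary slightly across MXNet versions; `net`, `train_data_loader` and `ctx` are assumed to be defined as elsewhere in this tutorial):

```python
from mxnet import gluon
from mxnet.gluon.contrib.estimator import estimator

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 0.001})
est = estimator.Estimator(net=net, loss=loss_fn, trainer=trainer, context=ctx)
est.fit(train_data=train_data_loader, epochs=2)  # the whole training loop in one call
```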

## Prerequisites

@@ -44,7 +44,7 @@ ctx = [mx.gpu(i) for i in range(gpu_count)] if gpu_count > 0 else mx.cpu()

## Dataset

-The [Fashion-MNIST](https://research.zalando.com/welcome/mission/research-projects/fashion-mnist/) dataset consists of fashion items divided into ten categories: t-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag and ankle boot.
+The [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset consists of fashion items divided into ten categories: t-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag and ankle boot.

- It has 60,000 grayscale images of size 28 * 28 for training.
- It has 10,000 grayscale images of size 28 * 28 for testing/validation.
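
A minimal sketch of loading the dataset described above through Gluon's built-in dataset class (the batch size here is an arbitrary choice):

```python
import mxnet as mx
from mxnet import gluon

transform = gluon.data.vision.transforms.ToTensor()  # HWC uint8 -> CHW float32 in [0, 1]
train_set = gluon.data.vision.FashionMNIST(train=True).transform_first(transform)
train_loader = gluon.data.DataLoader(train_set, batch_size=256, shuffle=True)
x, y = next(iter(train_loader))
print(x.shape, y.shape)  # (256, 1, 28, 28) (256,)
```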
@@ -17,7 +17,7 @@

# Normalization Blocks

-When training deep neural networks there are a number of techniques that are thought to be essential for model convergence. One important area is deciding how to initialize the parameters of the network. Using techniques such as [Xavier](https://mxnet.apache.org/api/python/optimization/optimization.html#mxnet.initializer.Xavier) initialization, we can improve the gradient flow through the network at the start of training. Another important technique is normalization: i.e. scaling and shifting certain values towards a distribution with a mean of 0 (i.e. zero-centered) and a standard deviation of 1 (i.e. unit variance). Which values you normalize depends on the exact method used, as we'll see later on.
+When training deep neural networks there are a number of techniques that are thought to be essential for model convergence. One important area is deciding how to initialize the parameters of the network. Using techniques such as [Xavier](../../../../../api/initializer/index.rst#mxnet.initializer.Xavier) initialization, we can improve the gradient flow through the network at the start of training. Another important technique is normalization: i.e. scaling and shifting certain values towards a distribution with a mean of 0 (i.e. zero-centered) and a standard deviation of 1 (i.e. unit variance). Which values you normalize depends on the exact method used, as we'll see later on.
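
For instance, a minimal sketch of applying Xavier initialization to a network:

```python
import mxnet as mx
from mxnet import gluon, init

net = gluon.nn.Dense(10)
net.initialize(init.Xavier())                 # deferred until the first forward pass
y = net(mx.nd.random.uniform(shape=(4, 20)))  # shapes inferred here, weights drawn
```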

<p align="center">
<img src="./imgs/data_normalization.jpeg" alt="drawing" width="500"/>
@@ -44,9 +44,9 @@ print('Time for converting to numpy: %f sec' % (time() - start))

From the timings above, it seems as if converting to numpy takes a lot more time than multiplying two large matrices. That doesn't seem right.

-This is because, in MXNet, all operations are executed asynchronously. So, when `nd.dot(x, x)` returns, the matrix multiplication is not complete; it has only been queued for execution. However, [asnumpy](https://mxnet.apache.org/api/python/ndarray/ndarray.html?highlight=asnumpy#mxnet.ndarray.NDArray.asnumpy) has to wait for the result to be calculated in order to convert it to a numpy array on the CPU, hence taking longer. Other examples of 'blocking' operations include [asscalar](https://mxnet.apache.org/api/python/ndarray/ndarray.html?highlight=asscalar#mxnet.ndarray.NDArray.asscalar) and [wait_to_read](https://mxnet.apache.org/api/python/ndarray/ndarray.html?highlight=wait_to_read#mxnet.ndarray.NDArray.wait_to_read).
+This is because, in MXNet, all operations are executed asynchronously. So, when `nd.dot(x, x)` returns, the matrix multiplication is not complete; it has only been queued for execution. However, [asnumpy](../../../api/legacy/ndarray/ndarray.rst#mxnet.ndarray.NDArray.asnumpy) has to wait for the result to be calculated in order to convert it to a numpy array on the CPU, hence taking longer. Other examples of 'blocking' operations include [asscalar](../../../api/legacy/ndarray/ndarray.rst#mxnet.ndarray.NDArray.asscalar) and [wait_to_read](../../../api/legacy/ndarray/ndarray.rst#mxnet.ndarray.NDArray.wait_to_read).

-While it is possible to use [NDArray.waitall()](https://mxnet.apache.org/api/python/ndarray/ndarray.html?highlight=waitall#mxnet.ndarray.waitall) before and after operations to get the running time of operations, it is not a scalable method for measuring the running time of multiple sets of operations, especially in a [Sequential](https://mxnet.apache.org/api/python/gluon/gluon.html?highlight=sequential#mxnet.gluon.nn.Sequential) or hybridized network.
+While it is possible to use [NDArray.waitall()](../../../api/legacy/ndarray/ndarray.rst#mxnet.ndarray.waitall) before and after operations to get the running time of operations, it is not a scalable method for measuring the running time of multiple sets of operations, especially in a [Sequential](../../../api/gluon/nn/index.rst#mxnet.gluon.nn.Sequential) or hybridized network.
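
For a single block of operations, that `waitall` approach looks like this (a sketch, reusing a matrix product as above):

```python
import mxnet as mx
from time import time

x = mx.nd.random.uniform(shape=(2000, 2000))
mx.nd.waitall()  # drain any previously queued work first
start = time()
y = mx.nd.dot(x, x)
mx.nd.waitall()  # block until the multiplication has actually finished
print('Time for matrix multiplication: %f sec' % (time() - start))
```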

## The correct way to profile

6 changes: 6 additions & 0 deletions python/mxnet/numpy/fallback.py
@@ -135,6 +135,12 @@ def wrapper(*args, **kwargs):
            # remove unused reference
            new_fn_doc = new_fn_doc.replace(
                '.. [1] Wikipedia page: https://en.wikipedia.org/wiki/Trapezoidal_rule', '')
+        elif obj_name == "i0":
+            # replace broken link
+            new_fn_doc = new_fn_doc.replace(
+                '.. [3] http://kobesearch.cpan.org/htdocs/Math-Cephes/Math/Cephes.html',
+                '.. [3] https://metacpan.org/pod/distribution/Math-Cephes/lib/Math/Cephes.pod \
+                #i0:-Modified-Bessel-function-of-order-zero')
        setattr(fallback_mod, obj_name, get_func(onp_obj, new_fn_doc))
    else:
        setattr(fallback_mod, obj_name, onp_obj)
2 changes: 1 addition & 1 deletion python/mxnet/onnx/mx2onnx/_export_model.py
@@ -55,7 +55,7 @@ def export_model(sym, params, in_shapes=None, in_types=np.float32,
"""Exports the MXNet model file, passed as a parameter, into ONNX model.
Accepts both symbol/parameter objects and json/params file paths as input.
Operator support and coverage -
-https://github.com/apache/incubator-mxnet/tree/v1.x/python/mxnet/onnx#operator-support-matrix
+https://github.com/apache/incubator-mxnet/tree/v1.x/python/mxnet/onnx#user-content-operator-support-matrix

Parameters
----------
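
For reference, a hedged usage sketch of `export_model` (the checkpoint file names and input shape below are hypothetical, and the `onnx_file_path` keyword is assumed from the rest of the signature):

```python
import numpy as np
from mxnet.onnx import export_model  # assumes the exporter lives under mxnet.onnx as in this PR

# Hypothetical MXNet checkpoint files with a single (1, 3, 224, 224) float32 input
onnx_file = export_model('resnet-symbol.json', 'resnet-0000.params',
                         in_shapes=[(1, 3, 224, 224)],
                         in_types=np.float32,
                         onnx_file_path='resnet.onnx')
```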
6 changes: 1 addition & 5 deletions src/io/iter_mnist.cc
@@ -258,11 +258,7 @@ class MNISTIter: public IIterator<TBlobBatch> {
DMLC_REGISTER_PARAMETER(MNISTParam);

MXNET_REGISTER_IO_ITER(MNISTIter)
.describe(R"code(Iterating on the MNIST dataset.

One can download the dataset from http://yann.lecun.com/exdb/mnist/

)code" ADD_FILELINE)
.describe("Iterating on the MNIST dataset." ADD_FILELINE)
.add_arguments(MNISTParam::__FIELDS__())
.add_arguments(PrefetcherParam::__FIELDS__())
.set_body([]() {
2 changes: 1 addition & 1 deletion src/operator/svm_output.cc
@@ -90,7 +90,7 @@ MXNET_REGISTER_OP_PROPERTY(SVMOutput, SVMOutputProp)
.describe(R"code(Computes support vector machine based transformation of the input.

This tutorial demonstrates using SVM as output layer for classification instead of softmax:
-https://github.com/dmlc/mxnet/tree/master/example/svm_mnist.
+https://github.com/dmlc/mxnet/tree/v1.x/example/svm_mnist.

)code")
.add_argument("data", "NDArray-or-Symbol", "Input data for SVM transformation.")
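
A short sketch of wiring `SVMOutput` into a symbolic network (the layer size and hyperparameters below are illustrative):

```python
import mxnet as mx

data = mx.sym.Variable('data')
fc = mx.sym.FullyConnected(data=data, num_hidden=10)
# Hinge-loss output layer in place of softmax; margin and L2 coefficient are tunable
svm = mx.sym.SVMOutput(data=fc, margin=1.0, regularization_coefficient=1.0)
```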
4 changes: 2 additions & 2 deletions src/operator/tensor/matrix_op.cc
@@ -991,7 +991,7 @@ NNVM_REGISTER_OP(_backward_squeeze)
NNVM_REGISTER_OP(depth_to_space)
.describe(R"code(Rearranges(permutes) data from depth into blocks of spatial data.
Similar to ONNX DepthToSpace operator:
-https://github.com/onnx/onnx/blob/master/docs/Operators.md#DepthToSpace.
+https://github.com/onnx/onnx/blob/master/docs/Operators.md#user-content-depthtospace.
The output is a new tensor where the values from depth dimension are moved in spatial blocks
to height and width dimension. The reverse of this operation is ``space_to_depth``.

@@ -1043,7 +1043,7 @@ Example::
NNVM_REGISTER_OP(space_to_depth)
.describe(R"code(Rearranges(permutes) blocks of spatial data into depth.
Similar to ONNX SpaceToDepth operator:
-https://github.com/onnx/onnx/blob/master/docs/Operators.md#SpaceToDepth
+https://github.com/onnx/onnx/blob/master/docs/Operators.md#user-content-spacetodepth
The output is a new tensor where the values from height and width dimension are
moved to the depth dimension. The reverse of this operation is ``depth_to_space``.
.. math::
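
A round-trip sketch of the two operators (the input shape is chosen so the channel count is divisible by `block_size`^2):

```python
import mxnet as mx

x = mx.nd.arange(16).reshape((1, 4, 2, 2))  # (N, C, H, W)
y = mx.nd.depth_to_space(x, block_size=2)   # -> shape (1, 1, 4, 4)
z = mx.nd.space_to_depth(y, block_size=2)   # inverse: back to (1, 4, 2, 2)
assert (x == z).min().asscalar() == 1.0     # round trip recovers the input
```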