diff --git a/examples/healthcare/application/Diabetic_Readmission_Prediction/README.md b/examples/healthcare/application/Diabetic_Readmission_Prediction/README.md
new file mode 100644
index 000000000..c58e6375a
--- /dev/null
+++ b/examples/healthcare/application/Diabetic_Readmission_Prediction/README.md
@@ -0,0 +1,45 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+
+# Singa for Diabetic Readmission Prediction task
+
+## Diabetic Readmission
+
+Diabetic readmission is a significant concern in healthcare, with a substantial number of patients being readmitted to the hospital within a short period after discharge. This not only leads to increased healthcare costs but also poses a risk to patient well-being.
+
+Although diabetes is a manageable condition, early identification of patients at high risk of readmission remains a challenge. A reliable and efficient predictive model can help identify these patients, enabling healthcare providers to intervene early and prevent unnecessary readmissions.
+
+To address this issue, we use Singa to implement a machine learning model for predicting diabetic readmission. The dataset is from [BMC Medical Informatics and Decision-Making](https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01423-y). Please download the dataset before running the scripts.
+
+
+## Structure
+
+* `data` includes the scripts for preprocessing Diabetic Readmission datasets.
+
+* `model` includes the MLP model construction code, which creates
+  a subclass of `Module` to wrap the neural network operations
+  of each model.
+
+* `train_mlp.py` is the training script, which controls the training flow by
+  running backpropagation and SGD updates.
+
+## Command
+```bash
+python train.py mlp diabetic
+```
diff --git a/examples/healthcare/application/Diabetic_Retinopathy_Classification/README.md b/examples/healthcare/application/Diabetic_Retinopathy_Classification/README.md
new file mode 100644
index 000000000..dfa88fd50
--- /dev/null
+++ b/examples/healthcare/application/Diabetic_Retinopathy_Classification/README.md
@@ -0,0 +1,51 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+
+# Singa for Diabetic Retinopathy Classification
+
+## Diabetic Retinopathy
+
+Diabetic Retinopathy (DR) is a progressive eye disease caused by long-term diabetes, which damages the blood vessels in the retina, the light-sensitive tissue at the back of the eye. It typically develops in stages, starting with non-proliferative diabetic retinopathy (NPDR), where weakened blood vessels leak fluid or blood, causing swelling or the formation of deposits. If untreated, it can progress to proliferative diabetic retinopathy (PDR), characterized by the growth of abnormal blood vessels that can lead to severe vision loss or blindness. Symptoms may include blurred vision, dark spots, or difficulty seeing at night, although it is often asymptomatic in the early stages. Early diagnosis through regular eye exams and timely treatment, such as laser therapy or anti-VEGF injections, can help manage the condition and prevent vision impairment.
+
+The dataset has 5 groups characterized by the severity of Diabetic Retinopathy (DR).
+
+- 0: No DR
+- 1: Mild Non-Proliferative DR
+- 2: Moderate Non-Proliferative DR
+- 3: Severe Non-Proliferative DR
+- 4: Proliferative DR
+
+
+To help address this problem, we use Singa to implement a machine learning model that assists with Diabetic Retinopathy diagnosis. The dataset is from [Kaggle](https://www.kaggle.com/datasets/mohammadasimbluemoon/diabeticretinopathy-messidor-eyepac-preprocessed). Please download the dataset before running the scripts.
+
+## Structure
+
+* `data` includes the scripts for preprocessing DR image datasets.
+
+* `model` includes the CNN model construction code, which creates
+  a subclass of `Module` to wrap the neural network operations
+  of each model.
+
+* `train.py` is the training script, which controls the training flow by
+  running backpropagation and SGD updates.
+
+## Command
+```bash
+python train.py cnn diaret -dir pathToDataset
+```
diff --git a/examples/healthcare/application/Diabetic_Retinopathy_Classification/train.py b/examples/healthcare/application/Diabetic_Retinopathy_Classification/train.py
new file mode 100644
index 000000000..5ef41851a
--- /dev/null
+++ b/examples/healthcare/application/Diabetic_Retinopathy_Classification/train.py
@@ -0,0 +1,297 @@
+from singa import singa_wrap as singa
+from singa import device
+from singa import tensor
+from singa import opt
+import numpy as np
+import time
+import argparse
+import sys
+sys.path.append("../../..")
+
+from PIL import Image
+
+from healthcare.data import diaret
+from healthcare.models import diabetic_retinopthy_net
+
+np_dtype = {"float16": np.float16, "float32": np.float32}
+
+singa_dtype = {"float16": tensor.float16, "float32": tensor.float32}
+
+
+# Data augmentation
+def augmentation(x, batch_size):
+    xpad = np.pad(x, [[0, 0], [0, 0], [4, 4], [4, 4]], 'symmetric')
+    for data_num in range(0, batch_size):
+        offset = np.random.randint(8, size=2)
+        x[data_num, :, :, :] = xpad[data_num, :,
+                               offset[0]:offset[0] + x.shape[2],
+                               offset[1]:offset[1] + x.shape[3]]
+        if_flip = np.random.randint(2)
+        if (if_flip):
+            x[data_num, :, :, :] = x[data_num, :, :, ::-1]
+    return x
+
+
+# Calculate accuracy
+def accuracy(pred, target):
+    # y is network output to be compared with ground truth (int)
+    y = np.argmax(pred, axis=1)
+    a = y == target
+    correct = np.array(a, "int").sum()
+    return correct
+
+
+# Data partition according to the rank
+def partition(global_rank, world_size, train_x, train_y, val_x, val_y):
+    # Partition training data
+    data_per_rank = train_x.shape[0] // world_size
+    idx_start = global_rank * data_per_rank
+    idx_end = (global_rank + 1) * data_per_rank
+    train_x = train_x[idx_start:idx_end]
+    train_y = train_y[idx_start:idx_end]
+
+    # Partition evaluation data
+    data_per_rank = val_x.shape[0] // world_size
+    idx_start = global_rank * data_per_rank
+    idx_end = (global_rank + 1) * data_per_rank
+    val_x = val_x[idx_start:idx_end]
+    val_y = val_y[idx_start:idx_end]
+    return train_x, train_y, val_x, val_y
+
+
+# Function to all reduce NUMPY accuracy and loss from multiple devices
+def reduce_variable(variable, dist_opt, reducer):
+    reducer.copy_from_numpy(variable)
+    dist_opt.all_reduce(reducer.data)
+    dist_opt.wait()
+    output = tensor.to_numpy(reducer)
+    return output
+
+
+def resize_dataset(x, image_size):
+    num_data = x.shape[0]
+    dim = x.shape[1]
+    X = np.zeros(shape=(num_data, dim, image_size, image_size),
+                 dtype=np.float32)
+    for n in range(0, num_data):
+        for d in range(0, dim):
+            X[n, d, :, :] = np.array(Image.fromarray(x[n, d, :, :]).resize(
+                (image_size, image_size), Image.BILINEAR),
+                dtype=np.float32)
+    return X
+
+
+def run(global_rank,
+        world_size,
+        dir_path,
+        max_epoch,
+        batch_size,
+        model,
+        data,
+        sgd,
+        graph,
+        verbosity,
+        dist_option='plain',
+        spars=None,
+        precision='float32'):
+    # Uses the default (CPU) device; switch to a GPU device on machines with GPU support
+    dev = device.get_default_device()
+    dev.SetRandSeed(0)
+    np.random.seed(0)
+    if data == 'diaret':
+        train_x, train_y, val_x, val_y = diaret.load(dir_path=dir_path)
+    else:
+        print(
+            'Wrong dataset!'
+        )
+        sys.exit(1)
+
+    num_channels = train_x.shape[1]
+    image_size = train_x.shape[2]
+    data_size = np.prod(train_x.shape[1:train_x.ndim]).item()
+    num_classes = (np.max(train_y) + 1).item()
+
+    if model == 'cnn':
+        model = diabetic_retinopthy_net.create_model(num_channels=num_channels,
+                                                     num_classes=num_classes)
+    else:
+        print(
+            'Wrong model!'
+        )
+        sys.exit(1)
+
+    # For distributed training, sequential has better performance
+    if hasattr(sgd, "communicator"):
+        DIST = True
+        sequential = True
+    else:
+        DIST = False
+        sequential = False
+
+    if DIST:
+        train_x, train_y, val_x, val_y = partition(global_rank, world_size,
+                                                   train_x, train_y, val_x,
+                                                   val_y)
+
+    if model.dimension == 4:
+        tx = tensor.Tensor(
+            (batch_size, num_channels, model.input_size, model.input_size), dev,
+            singa_dtype[precision])
+    elif model.dimension == 2:
+        tx = tensor.Tensor((batch_size, data_size),
+                           dev, singa_dtype[precision])
+        train_x = np.reshape(train_x, (train_x.shape[0], -1))
+        val_x = np.reshape(val_x, (val_x.shape[0], -1))
+
+    ty = tensor.Tensor((batch_size,), dev, tensor.int32)
+    num_train_batch = train_x.shape[0] // batch_size
+    num_val_batch = val_x.shape[0] // batch_size
+    idx = np.arange(train_x.shape[0], dtype=np.int32)
+
+    # Attach model to graph
+    model.set_optimizer(sgd)
+    model.compile([tx], is_train=True, use_graph=graph, sequential=sequential)
+    dev.SetVerbosity(verbosity)
+
+    # Training and evaluation loop
+    for epoch in range(max_epoch):
+        start_time = time.time()
+        np.random.shuffle(idx)
+
+        if global_rank == 0:
+            print('Starting Epoch %d:' % (epoch))
+
+        # Training phase
+        train_correct = np.zeros(shape=[1], dtype=np.float32)
+        test_correct = np.zeros(shape=[1], dtype=np.float32)
+        train_loss = np.zeros(shape=[1], dtype=np.float32)
+
+        model.train()
+        for b in range(num_train_batch):
+            # if b % 100 == 0:
+            #     print ("b: \n", b)
+            # Generate the batch data in this iteration
+            x = train_x[idx[b * batch_size:(b + 1) * batch_size]]
+            if model.dimension == 4:
+                x = augmentation(x, batch_size)
+                if (image_size != model.input_size):
+                    x = resize_dataset(x, model.input_size)
+            x = x.astype(np_dtype[precision])
+            y = train_y[idx[b * batch_size:(b + 1) * batch_size]]
+
+            # Copy the batch data into input tensors
+            tx.copy_from_numpy(x)
+            ty.copy_from_numpy(y)
+
+            # Train the model
+            out, loss = model(tx, ty, dist_option, spars)
+            train_correct += accuracy(tensor.to_numpy(out), y)
+            train_loss += tensor.to_numpy(loss)[0]
+
+        if DIST:
+            # Reduce the training accuracy and loss from multiple devices
+            reducer = tensor.Tensor((1,), dev, tensor.float32)
+            train_correct = reduce_variable(train_correct, sgd, reducer)
+            train_loss = reduce_variable(train_loss, sgd, reducer)
+
+        if global_rank == 0:
+            print('Training loss = %f, training accuracy = %f' %
+                  (train_loss, train_correct /
+                   (num_train_batch * batch_size * world_size)),
+                  flush=True)
+
+        # Evaluation phase
+        model.eval()
+        for b in range(num_val_batch):
+            x = val_x[b * batch_size:(b + 1) * batch_size]
+            if model.dimension == 4:
+                if (image_size != model.input_size):
+                    x = resize_dataset(x, model.input_size)
+            x = x.astype(np_dtype[precision])
+            y = val_y[b * batch_size:(b + 1) * batch_size]
+            tx.copy_from_numpy(x)
+            ty.copy_from_numpy(y)
+            out_test = model(tx)
+            test_correct += accuracy(tensor.to_numpy(out_test), y)
+
+        if DIST:
+            # Reduce the evaluation accuracy from multiple devices
+            test_correct = reduce_variable(test_correct, sgd, reducer)
+
+        # Output the evaluation accuracy
+        if global_rank == 0:
+            print('Evaluation accuracy = %f, Elapsed Time = %fs' %
+                  (test_correct / (num_val_batch * batch_size * world_size),
+                   time.time() - start_time),
+                  flush=True)
+
+    dev.PrintTimeProfiling()
+
+
+if __name__ == '__main__':
+    # Use argparse to get command config: max_epoch, model, data, etc., for single-device training
+    parser = argparse.ArgumentParser(
+        description='Training using the autograd and graph.')
+    parser.add_argument(
+        'model',
+        choices=['cnn'],
+        default='cnn')
+    parser.add_argument('data',
+                        choices=['diaret'],
+                        default='diaret')
+    parser.add_argument('-p',
+                        choices=['float32', 'float16'],
+                        default='float32',
+                        dest='precision')
+    parser.add_argument('-dir',
+                        '--dir-path',
+                        default="/tmp/diaret",
+                        type=str,
+                        help='the directory to store the Diabetic Retinopathy dataset',
+                        dest='dir_path')
+    parser.add_argument('-m',
+                        '--max-epoch',
+                        default=300,
+                        type=int,
+                        help='maximum epochs',
+                        dest='max_epoch')
+    parser.add_argument('-b',
+                        '--batch-size',
+                        default=64,
+                        type=int,
+                        help='batch size',
+                        dest='batch_size')
+    parser.add_argument('-l',
+                        '--learning-rate',
+                        default=0.005,
+                        type=float,
+                        help='initial learning rate',
+                        dest='lr')
+    parser.add_argument('-g',
+                        '--disable-graph',
+                        default=True,
+                        action='store_false',
+                        help='disable graph',
+                        dest='graph')
+    parser.add_argument('-v',
+                        '--log-verbosity',
+                        default=0,
+                        type=int,
+                        help='logging verbosity',
+                        dest='verbosity')
+
+    args = parser.parse_args()
+
+    sgd = opt.SGD(lr=args.lr, momentum=0.9, weight_decay=1e-5,
+                  dtype=singa_dtype[args.precision])
+    run(0,
+        1,
+        args.dir_path,
+        args.max_epoch,
+        args.batch_size,
+        args.model,
+        args.data,
+        sgd,
+        args.graph,
+        args.verbosity,
+        precision=args.precision)
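Review note on `partition` in the `train.py` diff above: each rank takes a contiguous slice of `train_x.shape[0] // world_size` samples, so any remainder at the end of the array is not assigned to a rank. The sketch below reproduces only that index arithmetic in plain Python. `partition_bounds` is a hypothetical helper used for illustration and is not part of the patch.

```python
def partition_bounds(num_samples, world_size, global_rank):
    """Return the [start, end) slice owned by one rank (hypothetical helper)."""
    # Floor division, mirroring `train_x.shape[0] // world_size` in partition()
    per_rank = num_samples // world_size
    start = global_rank * per_rank
    return start, start + per_rank


# Illustrative sizes only: 10 samples shared by 4 ranks
num_samples, world_size = 10, 4
bounds = [partition_bounds(num_samples, world_size, r) for r in range(world_size)]
# Samples past the last rank's slice are never used
dropped = num_samples - world_size * (num_samples // world_size)

print(bounds)
print(dropped)
```

With these illustrative sizes every rank receives 2 samples and the last 2 are left out. The same floor division is applied to both the training and validation arrays, so it may be worth stating in the README that dataset sizes not divisible by the world size lose a few samples per epoch.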
diff --git a/examples/healthcare/application/Hematologic_Disease/readme.md b/examples/healthcare/application/Hematologic_Disease/readme.md
new file mode 100644
index 000000000..d0f4902b9
--- /dev/null
+++ b/examples/healthcare/application/Hematologic_Disease/readme.md
@@ -0,0 +1,44 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+# CNN demo model on the BloodMNIST dataset
+
+## About dataset
+Download address: https://drive.google.com/drive/folders/1Ze9qri1UtAsIRoI0SJ4YRpdt5kUUMBEn?usp=sharing
+
+BloodMNIST, a subset of [MedMNIST](https://medmnist.com/), is based on a dataset of individual normal cells, captured from individuals without infection, hematologic or oncologic disease and free of any pharmacologic treatment at the moment of blood collection.
+It contains a total of 17,092 images and is organized into 8 classes.
+It is split with a ratio of 7:1:2 into training, validation and test sets.
+The source images with resolution 3×360×363 pixels are center-cropped into 3×200×200, and then resized into 3×28×28.
+
+The 8 classes of the dataset:
+```python
+"0": "basophil",
+"1": "eosinophil",
+"2": "erythroblast",
+"3": "ig (immature granulocytes)",
+"4": "lymphocyte",
+"5": "monocyte",
+"6": "neutrophil",
+"7": "platelet"
+```
+
+## Command
+```bash
+python train_cnn.py cnn bloodmnist -dir pathToDataset
+```
diff --git a/examples/healthcare/application/Hematologic_Disease/run.sh b/examples/healthcare/application/Hematologic_Disease/run.sh
new file mode 100644
index 000000000..c4a321ede
--- /dev/null
+++ b/examples/healthcare/application/Hematologic_Disease/run.sh
@@ -0,0 +1,20 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+### bloodmnist dataset
+python train_cnn.py cnn bloodmnist -dir pathToDataset
diff --git a/examples/healthcare/application/Hematologic_Disease/train_cnn.py b/examples/healthcare/application/Hematologic_Disease/train_cnn.py
new file mode 100644
index 000000000..0f267cd5a
--- /dev/null
+++ b/examples/healthcare/application/Hematologic_Disease/train_cnn.py
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+import time
+from singa import singa_wrap as singa
+from singa import device
+from singa import tensor
+from singa import opt
+import numpy as np
+from tqdm import tqdm
+import argparse
+import sys
+sys.path.append("../../..")
+
+from healthcare.data import bloodmnist
+from healthcare.models import hematologic_net
+
+np_dtype = {"float16": np.float16, "float32": np.float32}
+singa_dtype = {"float16": tensor.float16, "float32": tensor.float32}
+
+
+def accuracy(pred, target):
+    """Count the correct predictions in a batch.
+
+    Args:
+        pred (Numpy ndarray): Prediction array, should be in shape (B, C)
+        target (Numpy ndarray): Ground truth array, should be in shape (B, )
+
+    Return:
+        correct (int): Number of predictions that match the ground truth
+    """
+    # y is network output to be compared with ground truth (int)
+    y = np.argmax(pred, axis=1)
+    a = (y == np.asarray(target).reshape(-1))
+    correct = np.array(a, "int").sum()
+    return correct
+
+def run(dir_path,
+        max_epoch,
+        batch_size,
+        model,
+        data,
+        lr,
+        graph,
+        verbosity,
+        dist_option='plain',
+        spars=None,
+        precision='float32'):
+    # Start training
+    dev = device.create_cpu_device()
+    dev.SetRandSeed(0)
+    np.random.seed(0)
+    if data == 'bloodmnist':
+        train_dataset, val_dataset, num_class = bloodmnist.load(dir_path=dir_path)
+    else:
+        print(
+            'Wrong dataset!'
+        )
+        sys.exit(0)
+
+    if model == 'cnn':
+        model = hematologic_net.create_model(num_classes=num_class)
+    else:
+        print(
+            'Wrong model!'
+        )
+        sys.exit(0)
+
+    # Model configuration for CNN
+    # criterion = layer.SoftMaxCrossEntropy()
+    optimizer_ft = opt.Adam(lr)
+
+    tx = tensor.Tensor(
+        (batch_size, 3, model.input_size, model.input_size), dev,
+        singa_dtype[precision])
+    ty = tensor.Tensor((batch_size,), dev, tensor.int32)
+
+    num_train_batch = len(train_dataset) // batch_size
+    num_val_batch = len(val_dataset) // batch_size
+    idx = np.arange(len(train_dataset), dtype=np.int32)
+
+    # Attach model to graph
+    model.set_optimizer(optimizer_ft)
+    model.compile([tx], is_train=True, use_graph=graph, sequential=False)
+    dev.SetVerbosity(verbosity)
+
+    # Training and evaluation loop
+    for epoch in range(max_epoch):
+        print(f'Epoch {epoch}:')
+
+        start_time = time.time()
+
+        train_correct = np.zeros(shape=[1], dtype=np.float32)
+        test_correct = np.zeros(shape=[1], dtype=np.float32)
+        train_loss = np.zeros(shape=[1], dtype=np.float32)
+
+        # Training part
+        model.train()
+        for b in tqdm(range(num_train_batch)):
+            # Extract batch from image list
+            x, y = train_dataset.batchgenerator(idx[b * batch_size:(b + 1) * batch_size],
+                batch_size=batch_size, data_size=(3, model.input_size, model.input_size))
+            x = x.astype(np_dtype[precision])
+
+            tx.copy_from_numpy(x)
+            ty.copy_from_numpy(y)
+
+            out, loss = model(tx, ty, dist_option, spars)
+            train_correct += accuracy(tensor.to_numpy(out), y)
+            train_loss += tensor.to_numpy(loss)[0]
+        print('Training loss = %f, training accuracy = %f' %
+                      (train_loss, train_correct /
+                       (num_train_batch * batch_size)))
+
+        # Validation part
+        model.eval()
+        for b in tqdm(range(num_val_batch)):
+            x, y = val_dataset.batchgenerator(idx[b * batch_size:(b + 1) * batch_size],
+                batch_size=batch_size, data_size=(3, model.input_size, model.input_size))
+            x = x.astype(np_dtype[precision])
+
+            tx.copy_from_numpy(x)
+            ty.copy_from_numpy(y)
+
+            out = model(tx)
+            test_correct += accuracy(tensor.to_numpy(out), y)
+
+        print('Evaluation accuracy = %f, Elapsed Time = %fs' %
+                      (test_correct / (num_val_batch * batch_size),
+                       time.time() - start_time))
+
+
+if __name__ == '__main__':
+    # Use argparse to get command config: max_epoch, model, data, etc., for single-device (CPU) training
+    parser = argparse.ArgumentParser(
+        description='Training using the autograd and graph.')
+    parser.add_argument(
+        'model',
+        choices=['cnn'],
+        default='cnn')
+    parser.add_argument('data',
+                        choices=['bloodmnist'],
+                        default='bloodmnist')
+    parser.add_argument('-p',
+                        choices=['float32', 'float16'],
+                        default='float32',
+                        dest='precision')
+    parser.add_argument('-dir',
+                        '--dir-path',
+                        default="/tmp/bloodmnist",
+                        type=str,
+                        help='the directory to store the bloodmnist dataset',
+                        dest='dir_path')
+    parser.add_argument('-m',
+                        '--max-epoch',
+                        default=100,
+                        type=int,
+                        help='maximum epochs',
+                        dest='max_epoch')
+    parser.add_argument('-b',
+                        '--batch-size',
+                        default=256,
+                        type=int,
+                        help='batch size',
+                        dest='batch_size')
+    parser.add_argument('-l',
+                        '--learning-rate',
+                        default=0.003,
+                        type=float,
+                        help='initial learning rate',
+                        dest='lr')
+    parser.add_argument('-g',
+                        '--disable-graph',
+                        default=True,
+                        action='store_false',
+                        help='disable graph',
+                        dest='graph')
+    parser.add_argument('-v',
+                        '--log-verbosity',
+                        default=0,
+                        type=int,
+                        help='logging verbosity',
+                        dest='verbosity')
+
+    args = parser.parse_args()
+
+    run(args.dir_path,
+        args.max_epoch,
+        args.batch_size,
+        args.model,
+        args.data,
+        args.lr,
+        args.graph,
+        args.verbosity,
+        precision=args.precision)
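Review note on the `-g/--disable-graph` option in the training scripts above: with `action='store_false'`, the destination keeps its default when the flag is absent and becomes `False` when the flag is given. A minimal sketch of that behaviour, assuming a boolean default. `build_parser` is a hypothetical name used only here.

```python
import argparse


def build_parser():
    """Build a parser with only the graph switch (hypothetical helper)."""
    parser = argparse.ArgumentParser(description='store_false sketch')
    # Graph mode stays on unless -g / --disable-graph is passed
    parser.add_argument('-g',
                        '--disable-graph',
                        default=True,
                        action='store_false',
                        help='disable graph',
                        dest='graph')
    return parser


# No flag: graph mode stays enabled
print(build_parser().parse_args([]).graph)
# Flag given: graph mode is switched off
print(build_parser().parse_args(['-g']).graph)
```

Because `store_false` applies no type conversion, a boolean default keeps `args.graph` a real `bool` in both cases, which is what `model.compile(..., use_graph=graph, ...)` receives.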
diff --git a/examples/healthcare/application/Malaria_Detection/readme.md b/examples/healthcare/application/Malaria_Detection/readme.md
new file mode 100644
index 000000000..00100b77f
--- /dev/null
+++ b/examples/healthcare/application/Malaria_Detection/readme.md
@@ -0,0 +1,44 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+
+# Singa for Malaria Detection Task
+
+## Malaria
+
+Malaria is caused by parasites and can be transmitted through infected mosquitoes. There are about 200 million cases and about 400,000 deaths worldwide per year, so malaria is a major burden on global health.
+
+Although malaria is a curable disease, inadequate diagnostics make it harder to reduce mortality. A fast and reliable diagnostic test is therefore a promising and effective way to fight malaria.
+
+To help address this problem, we use Singa to implement a machine learning model that assists with malaria diagnosis. The dataset is from [Kaggle](https://www.kaggle.com/datasets/miracle9to9/files1?resource=download). Please download the dataset before running the scripts.
+
+## Structure
+
+* `malaria.py` in the `healthcare/data` directory is the script for preprocessing Malaria image datasets.
+
+* `malaria_net.py` in the `healthcare/models` directory includes the CNN model construction code, which creates
+  a subclass of `Module` to wrap the neural network operations
+  of each model.
+
+* `train_cnn.py` is the training script, which controls the training flow by
+  running backpropagation and SGD updates.
+
+## Command
+```bash
+python train_cnn.py cnn malaria -dir pathToDataset
+```
\ No newline at end of file
diff --git a/examples/healthcare/application/Malaria_Detection/run.sh b/examples/healthcare/application/Malaria_Detection/run.sh
new file mode 100644
index 000000000..8e10e9924
--- /dev/null
+++ b/examples/healthcare/application/Malaria_Detection/run.sh
@@ -0,0 +1,20 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+### malaria dataset
+python train_cnn.py cnn malaria -dir pathToDataset
\ No newline at end of file
diff --git a/examples/healthcare/application/TED_CT_Detection/README.md b/examples/healthcare/application/TED_CT_Detection/README.md
index ee0e3b425..f23e2404a 100644
--- a/examples/healthcare/application/TED_CT_Detection/README.md
+++ b/examples/healthcare/application/TED_CT_Detection/README.md
@@ -2,10 +2,15 @@
 
 We have successfully applied the idea of prototype loss in various medical image classification task to improve performance, for example detection thyroid eye disease from CT images. Here we provide the implementation of the convolution prototype model in Singa. Due to data privacy, we are not able to release the CT image dataset used. The training scripts `./train.py` demonstrate how to apply this model on cifar-10 dataset.
 
+
 ## run
 
-At Singa project root directory `python examples/healthcare/application/TED_CT_Detection/train.py`
+1. Download the `healthcare` directory, then change to the `healthcare/application/TED_CT_Detection` directory.
+2. Run the training command.
+```bash
+python train.py -dir pathToDataset
+```
 
 ## reference
 
-[Robust Classification with Convolutional Prototype Learning](https://arxiv.org/abs/1805.03438)
+[Robust Classification with Convolutional Prototype Learning](https://arxiv.org/abs/1805.03438)
\ No newline at end of file
diff --git a/examples/healthcare/application/TED_CT_Detection/model.py b/examples/healthcare/application/TED_CT_Detection/model.py
deleted file mode 100644
index 1c136d716..000000000
--- a/examples/healthcare/application/TED_CT_Detection/model.py
+++ /dev/null
@@ -1,119 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-from singa import layer
-from singa import model
-import singa.tensor as tensor
-from singa import autograd
-from singa.tensor import Tensor
-
-
-class CPLayer(layer.Layer):
-    def __init__(self, prototype_count=2, temp=10.0):
-        super(CPLayer, self).__init__()
-        self.prototype_count = prototype_count
-        self.temp = temp
-
-    def initialize(self, x):
-        self.feature_dim = x.shape[1]
-        self.prototype = tensor.random(
-            (self.feature_dim, self.prototype_count), device=x.device
-        )
-
-    def forward(self, feat):
-        self.device_check(feat, self.prototype)
-        self.dtype_check(feat, self.prototype)
-
-        feat_sq = autograd.mul(feat, feat)
-        feat_sq_sum = autograd.reduce_sum(feat_sq, axes=[1], keepdims=1)
-        feat_sq_sum_tile = autograd.tile(feat_sq_sum, repeats=[1, self.feature_dim])
-
-        prototype_sq = autograd.mul(self.prototype, self.prototype)
-        prototype_sq_sum = autograd.reduce_sum(prototype_sq, axes=[0], keepdims=1)
-        prototype_sq_sum_tile = autograd.tile(prototype_sq_sum, repeats=feat.shape[0])
-
-        cross_term = autograd.matmul(feat, self.prototype)
-        cross_term_scale = Tensor(
-            shape=cross_term.shape, device=cross_term.device, requires_grad=False
-        ).set_value(-2)
-        cross_term_scaled = autograd.mul(cross_term, cross_term_scale)
-
-        dist = autograd.add(feat_sq_sum_tile, prototype_sq_sum_tile)
-        dist = autograd.add(dist, cross_term_scaled)
-
-        logits_coeff = (
-            tensor.ones((feat.shape[0], self.prototype.shape[1]), device=feat.device)
-            * -1.0
-            / self.temp
-        )
-        logits_coeff.requires_grad = False
-        logits = autograd.mul(logits_coeff, dist)
-
-        return logits
-
-    def get_params(self):
-        return {self.prototype.name: self.prototype}
-
-    def set_params(self, parameters):
-        self.prototype.copy_from(parameters[self.prototype.name])
-
-
-class CPL(model.Model):
-
-    def __init__(
-        self,
-        backbone: model.Model,
-        prototype_count=2,
-        lamb=0.5,
-        temp=10,
-        label=None,
-        prototype_weight=None,
-    ):
-        super(CPL, self).__init__()
-        # config
-        self.lamb = lamb
-        self.prototype_weight = prototype_weight
-        self.prototype_label = label
-
-        # layer
-        self.backbone = backbone
-        self.cplayer = CPLayer(prototype_count=prototype_count, temp=temp)
-        # optimizer
-        self.softmax_cross_entropy = layer.SoftMaxCrossEntropy()
-
-    def forward(self, x):
-        feat = self.backbone.forward(x)
-        logits = self.cplayer(feat)
-        return logits
-
-    def train_one_batch(self, x, y):
-        out = self.forward(x)
-        loss = self.softmax_cross_entropy(out, y)
-        self.optimizer(loss)
-        return out, loss
-
-    def set_optimizer(self, optimizer):
-        self.optimizer = optimizer
-
-
-def create_model(backbone, prototype_count=2, lamb=0.5, temp=10.0):
-    model = CPL(backbone, prototype_count=prototype_count, lamb=lamb, temp=temp)
-    return model
-
-
-__all__ = ["CPL", "create_model"]
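Reviewer note: the removed `CPLayer.forward` builds squared Euclidean distances between features and prototypes from the expansion ||f - p||^2 = ||f||^2 + ||p||^2 - 2 f·p, then scales them by -1/temp to get logits. The NumPy sketch below reproduces that arithmetic outside SINGA. `prototype_logits` is a hypothetical helper introduced only for illustration, and it relies on broadcasting instead of `autograd.tile`, so treat it as a sanity check of the math and not as the SINGA implementation.

```python
import numpy as np


def prototype_logits(feat, prototype, temp=10.0):
    """Hypothetical helper: feat is (batch, dim), prototype is (dim, count)."""
    # Squared norm of every feature row: shape (batch, 1).
    feat_sq = np.sum(feat * feat, axis=1, keepdims=True)
    # Squared norm of every prototype column: shape (1, count).
    proto_sq = np.sum(prototype * prototype, axis=0, keepdims=True)
    # Dot product between each feature and each prototype: shape (batch, count).
    cross = feat @ prototype
    # Broadcasting combines the three terms into squared distances.
    dist = feat_sq + proto_sq - 2.0 * cross
    # Closer prototypes get larger logits.
    return -dist / temp


feat = np.array([[1.0, 0.0], [0.0, 2.0]])
# Columns are the prototypes (1, 0) and (0, 0).
prototype = np.array([[1.0, 0.0], [0.0, 0.0]])
logits = prototype_logits(feat, prototype)
```

The first feature coincides with the first prototype, so its logit for that prototype is the largest value in the row.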
diff --git a/examples/healthcare/application/TED_CT_Detection/train.py b/examples/healthcare/application/TED_CT_Detection/train.py
index 4994c2aaa..2b045fd93 100644
--- a/examples/healthcare/application/TED_CT_Detection/train.py
+++ b/examples/healthcare/application/TED_CT_Detection/train.py
@@ -26,13 +26,10 @@
 from PIL import Image
 
 import sys
+sys.path.append("../../..")
 
-sys.path.append(".")
-print(sys.path)
-
-import examples.cnn.model.cnn as cnn
-from examples.cnn.data import cifar10
-import model as cpl
+from healthcare.data import cifar10
+from healthcare.models import tedct_net
 
 
 def accuracy(pred, target):
@@ -60,6 +57,7 @@ def resize_dataset(x, image_size):
 
 def run(
     local_rank,
+    dir_path,
     max_epoch,
     batch_size,
     sgd,
@@ -68,18 +66,19 @@ def run(
     dist_option="plain",
     spars=None,
 ):
-    dev = device.create_cuda_gpu_on(local_rank)
+    # To train on a GPU instead, use device.create_cuda_gpu_on(local_rank).
+    dev = device.get_default_device()
     dev.SetRandSeed(0)
     np.random.seed(0)
 
-    train_x, train_y, val_x, val_y = cifar10.load()
+    train_x, train_y, val_x, val_y = cifar10.load(dir_path)
 
     num_channels = train_x.shape[1]
     data_size = np.prod(train_x.shape[1 : train_x.ndim]).item()
     num_classes = (np.max(train_y) + 1).item()
 
-    backbone = cnn.create_model(num_channels=num_channels, num_classes=num_classes)
-    model = cpl.create_model(backbone, prototype_count=10, lamb=0.5, temp=10)
+    backbone = tedct_net.create_cnn_model(num_channels=num_channels, num_classes=num_classes)
+    model = tedct_net.create_model(backbone, prototype_count=10, lamb=0.5, temp=10)
 
     if backbone.dimension == 4:
         tx = tensor.Tensor(
@@ -139,6 +138,12 @@ def run(
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Train a CPL model")
+    parser.add_argument('-dir',
+                        '--dir-path',
+                        default="/tmp/cifar-10-batches-py",
+                        type=str,
+                        help='directory that contains the CIFAR-10 python batches',
+                        dest='dir_path')
     parser.add_argument(
         "-m",
         "--max-epoch",
@@ -187,5 +192,11 @@ def run(
 
     sgd = opt.SGD(lr=args.lr, momentum=0.9, weight_decay=1e-5)
     run(
-        args.device_id, args.max_epoch, args.batch_size, sgd, args.graph, args.verbosity
-    )
+        args.device_id,
+        args.dir_path,
+        args.max_epoch,
+        args.batch_size,
+        sgd,
+        args.graph,
+        args.verbosity
+    )
\ No newline at end of file
diff --git a/examples/healthcare/data/cifar10.py b/examples/healthcare/data/cifar10.py
new file mode 100644
index 000000000..8e6c3f9ac
--- /dev/null
+++ b/examples/healthcare/data/cifar10.py
@@ -0,0 +1,89 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+try:
+    import pickle
+except ImportError:
+    import cPickle as pickle
+
+import numpy as np
+import os
+import sys
+
+
+def load_dataset(filepath):
+    with open(filepath, 'rb') as fd:
+        try:
+            cifar10 = pickle.load(fd, encoding='latin1')
+        except TypeError:
+            cifar10 = pickle.load(fd)
+    image = cifar10['data'].astype(dtype=np.uint8)
+    image = image.reshape((-1, 3, 32, 32))
+    label = np.asarray(cifar10['labels'], dtype=np.uint8)
+    label = label.reshape(label.size, 1)
+    return image, label
+
+
+def load_train_data(dir_path='/tmp/cifar-10-batches-py', num_batches=5):  # dir_path must contain the extracted data_batch_* files
+    labels = []
+    batchsize = 10000
+    images = np.empty((num_batches * batchsize, 3, 32, 32), dtype=np.uint8)
+    for did in range(1, num_batches + 1):
+        fname_train_data = dir_path + "/data_batch_{}".format(did)
+        image, label = load_dataset(check_dataset_exist(fname_train_data))
+        images[(did - 1) * batchsize:did * batchsize] = image
+        labels.extend(label)
+    images = np.array(images, dtype=np.float32)
+    labels = np.array(labels, dtype=np.int32)
+    return images, labels
+
+
+def load_test_data(dir_path='/tmp/cifar-10-batches-py'):  # dir_path must contain the extracted test_batch file
+    images, labels = load_dataset(check_dataset_exist(dir_path + "/test_batch"))
+    return np.array(images, dtype=np.float32), np.array(labels, dtype=np.int32)
+
+
+def check_dataset_exist(dirpath):
+    if not os.path.exists(dirpath):
+        print(
+            'Please download the cifar10 dataset.'
+        )
+        sys.exit(1)
+    return dirpath
+
+
+def normalize(train_x, val_x):
+    mean = [0.4914, 0.4822, 0.4465]
+    std = [0.2023, 0.1994, 0.2010]
+    train_x /= 255
+    val_x /= 255
+    for ch in range(len(mean)):
+        train_x[:, ch, :, :] -= mean[ch]
+        train_x[:, ch, :, :] /= std[ch]
+        val_x[:, ch, :, :] -= mean[ch]
+        val_x[:, ch, :, :] /= std[ch]
+    return train_x, val_x
+
+def load(dir_path):
+    train_x, train_y = load_train_data(dir_path)
+    val_x, val_y = load_test_data(dir_path)
+    train_x, val_x = normalize(train_x, val_x)
+    train_y = train_y.flatten()
+    val_y = val_y.flatten()
+    return train_x, train_y, val_x, val_y
\ No newline at end of file
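Reviewer note: per-channel standardization of NCHW image data can also be written with broadcasting, which covers every channel listed in `mean` and `std` without an explicit loop. The sketch below assumes pixel values in the 0 to 255 range. `normalize_channels` is a hypothetical name, not part of this patch.

```python
import numpy as np


def normalize_channels(x, mean, std):
    """Hypothetical helper: scale NCHW data to [0, 1], then standardize per channel."""
    x = x.astype(np.float32) / 255.0
    # Reshape to (1, C, 1, 1) so the statistics broadcast over batch, height and width.
    mean = np.asarray(mean, dtype=np.float32).reshape(1, -1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(1, -1, 1, 1)
    return (x - mean) / std


# Two all-white 4x4 RGB images as a tiny stand-in for real data.
batch = np.full((2, 3, 4, 4), 255.0, dtype=np.float32)
cifar_mean = [0.4914, 0.4822, 0.4465]
cifar_std = [0.2023, 0.1994, 0.2010]
out = normalize_channels(batch, cifar_mean, cifar_std)
```

Unlike the in-place `/=` and `-=` operations, this version returns a new array and leaves its input untouched.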
diff --git a/examples/healthcare/data/diaret.py b/examples/healthcare/data/diaret.py
new file mode 100644
index 000000000..3e468c880
--- /dev/null
+++ b/examples/healthcare/data/diaret.py
@@ -0,0 +1,89 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+try:
+    import pickle
+except ImportError:
+    import cPickle as pickle
+
+import os
+import sys
+import random
+import numpy as np
+from PIL import Image
+
+
+# dir_path must contain one sub-directory per integer class label, each holding that label's images
+def load_data(dir_path="/tmp/diaret", resize_size=(128, 128)):
+    dir_path = check_dataset_exist(dirpath=dir_path)
+    # Each sub-directory of dir_path is named after its integer class label.
+    # Sort the names so the sample order does not depend on the filesystem.
+    images, labels = [], []
+    for label in sorted(os.listdir(dir_path)):
+        image_names = load_image_path(os.listdir(os.path.join(dir_path, label)))
+        label_images = [np.array(Image.open(os.path.join(dir_path, label, img_name))\
+                .resize(resize_size).convert("RGB")).transpose(2, 0, 1)
+                for img_name in image_names]
+        images.extend(label_images)
+        labels.extend([int(label)] * len(label_images))
+
+    images = np.array(images, dtype=np.float32)
+    labels = np.array(labels, dtype=np.int32)
+    return images, labels
+
+
+def load_image_path(image_pths):
+    allowed_image_format = ['png', 'jpg', 'jpeg']
+    return list(filter(lambda pth: pth.rsplit('.')[-1].lower() in allowed_image_format,
+        image_pths))
+
+
+def check_dataset_exist(dirpath):
+    if not os.path.exists(dirpath):
+        print(
+            'Please download the Diabetic Retinopathy dataset first'
+        )
+        sys.exit(1)
+    return dirpath
+
+
+def normalize(train_x, val_x):
+    mean = [0.5339, 0.4180, 0.4460]  # mean for dataset
+    std = [0.3329, 0.2637, 0.2761]  # std for dataset
+    train_x /= 255
+    val_x /= 255
+    for ch in range(len(mean)):
+        train_x[:, ch, :, :] -= mean[ch]
+        train_x[:, ch, :, :] /= std[ch]
+        val_x[:, ch, :, :] -= mean[ch]
+        val_x[:, ch, :, :] /= std[ch]
+    return train_x, val_x
+
+
+def train_test_split(x, y, val_ratio=0.2):
+    indices = list(range(len(x)))
+    val_indices = list(random.sample(indices, int(val_ratio*len(x))))
+    train_indices = list(set(indices) - set(val_indices))
+    return x[train_indices], y[train_indices], x[val_indices], y[val_indices]
+
+
+def load(dir_path):
+    x, y = load_data(dir_path=dir_path)
+    train_x, train_y, val_x, val_y = train_test_split(x, y)
+    train_x, val_x = normalize(train_x, val_x)
+    return train_x, train_y, val_x, val_y
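Reviewer note: `train_test_split` draws its validation indices from the global `random` state, so the split changes between runs unless the caller seeds it. A minimal sketch of a reproducible alternative is shown below. `split_indices` and its `seed` parameter are hypothetical and not part of this patch.

```python
import numpy as np


def split_indices(n, val_ratio=0.2, seed=0):
    """Hypothetical helper: return (train_idx, val_idx) for n samples."""
    # A dedicated generator keeps the split independent of global random state.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_val = int(val_ratio * n)
    # The first n_val shuffled indices form the validation set.
    return perm[n_val:], perm[:n_val]


train_idx, val_idx = split_indices(10)
repeat_train_idx, repeat_val_idx = split_indices(10)
```

The returned index arrays can be used directly as `x[train_idx]` and `y[train_idx]` on NumPy arrays.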
diff --git a/examples/healthcare/models/diabetic_retinopthy_net.py b/examples/healthcare/models/diabetic_retinopthy_net.py
new file mode 100644
index 000000000..856adb7e7
--- /dev/null
+++ b/examples/healthcare/models/diabetic_retinopthy_net.py
@@ -0,0 +1,94 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from singa import layer
+from singa import model
+
+
+class CNN(model.Model):
+
+    def __init__(self, num_classes=10, num_channels=1):
+        super(CNN, self).__init__()
+        self.num_classes = num_classes
+        self.input_size = 128
+        self.dimension = 4
+        self.conv1 = layer.Conv2d(num_channels, 32, 3, padding=0, activation="RELU")
+        self.conv2 = layer.Conv2d(32, 64, 3, padding=0, activation="RELU")
+        self.conv3 = layer.Conv2d(64, 64, 3, padding=0, activation="RELU")
+        self.linear1 = layer.Linear(128)
+        self.linear2 = layer.Linear(num_classes)
+        self.pooling1 = layer.MaxPool2d(2, 2, padding=0)
+        self.pooling2 = layer.MaxPool2d(2, 2, padding=0)
+        self.pooling3 = layer.MaxPool2d(2, 2, padding=0)
+        self.relu = layer.ReLU()
+        self.flatten = layer.Flatten()
+        self.softmax_cross_entropy = layer.SoftMaxCrossEntropy()
+        self.sigmoid = layer.Sigmoid()
+
+    def forward(self, x):
+        y = self.conv1(x)
+        y = self.pooling1(y)
+        y = self.conv2(y)
+        y = self.pooling2(y)
+        y = self.conv3(y)
+        y = self.pooling3(y)
+        y = self.flatten(y)
+        y = self.linear1(y)
+        y = self.relu(y)
+        y = self.linear2(y)
+        return y
+
+    def train_one_batch(self, x, y, dist_option, spars):
+        out = self.forward(x)
+        loss = self.softmax_cross_entropy(out, y)
+
+        if dist_option == 'plain':
+            self.optimizer(loss)
+        elif dist_option == 'half':
+            self.optimizer.backward_and_update_half(loss)
+        elif dist_option == 'partialUpdate':
+            self.optimizer.backward_and_partial_update(loss)
+        elif dist_option == 'sparseTopK':
+            self.optimizer.backward_and_sparse_update(loss,
+                                                      topK=True,
+                                                      spars=spars)
+        elif dist_option == 'sparseThreshold':
+            self.optimizer.backward_and_sparse_update(loss,
+                                                      topK=False,
+                                                      spars=spars)
+        return out, loss
+
+    def set_optimizer(self, optimizer):
+        self.optimizer = optimizer
+
+
+def create_model(**kwargs):
+    """Constructs a CNN model.
+
+    Args:
+        **kwargs: forwarded to `CNN` (`num_classes`, `num_channels`).
+
+    Returns:
+        The created CNN model.
+    """
+    model = CNN(**kwargs)
+
+    return model
+
+
+__all__ = ['CNN', 'create_model']
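Reviewer note: `layer.Linear(128)` infers its input width at first use, so the flattened feature size never appears in the code. The sketch below works it out for the 128x128 input that `input_size` declares. It assumes stride-1 convolutions and pooling that rounds down, which the patch does not state explicitly, and `conv_out` and `pool_out` are hypothetical helpers.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output width of a convolution (assumes stride 1 unless told otherwise)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output width of a pooling layer (assumes the result is rounded down)."""
    return (size - kernel) // stride + 1


size = 128
sizes = []
# Three rounds of 3x3 convolution without padding followed by 2x2 max pooling.
for _ in range(3):
    size = pool_out(conv_out(size, kernel=3))
    sizes.append(size)

# conv3 has 64 output channels, so this is the width seen by the first Linear layer.
flat_features = 64 * size * size
```

Under these assumptions the spatial size shrinks 128 -> 63 -> 30 -> 14, giving 64 * 14 * 14 = 12544 inputs to `linear1`.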
diff --git a/examples/healthcare/models/hematologic_net.py b/examples/healthcare/models/hematologic_net.py
new file mode 100644
index 000000000..fadd050e9
--- /dev/null
+++ b/examples/healthcare/models/hematologic_net.py
@@ -0,0 +1,124 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from singa import layer
+from singa import model
+from singa import tensor
+from singa import opt
+from singa import device
+
+import numpy as np
+
+
+np_dtype = {"float16": np.float16, "float32": np.float32}
+
+singa_dtype = {"float16": tensor.float16, "float32": tensor.float32}
+
+
+class CNNModel(model.Model):
+    def __init__(self, num_classes):
+        super(CNNModel, self).__init__()
+        self.input_size = 28
+        self.dimension = 4
+        self.num_classes = num_classes
+
+        self.layer1 = layer.Conv2d(16, kernel_size=3, activation="RELU")
+        self.bn1 = layer.BatchNorm2d()
+        self.layer2 = layer.Conv2d(16, kernel_size=3, activation="RELU")
+        self.bn2 = layer.BatchNorm2d()
+        self.pooling2 = layer.MaxPool2d(kernel_size=2, stride=2)
+        self.layer3 = layer.Conv2d(64, kernel_size=3, activation="RELU")
+        self.bn3 = layer.BatchNorm2d()
+        self.layer4 = layer.Conv2d(64, kernel_size=3, activation="RELU")
+        self.bn4 = layer.BatchNorm2d()
+        self.layer5 = layer.Conv2d(64, kernel_size=3, padding=1, activation="RELU")
+        self.bn5 = layer.BatchNorm2d()
+        self.pooling5 = layer.MaxPool2d(kernel_size=2, stride=2)
+
+        self.flatten = layer.Flatten()
+
+        self.linear1 = layer.Linear(128)
+        self.linear2 = layer.Linear(128)
+        self.linear3 = layer.Linear(self.num_classes)
+
+        self.relu = layer.ReLU()
+
+        self.softmax_cross_entropy = layer.SoftMaxCrossEntropy()
+        self.dropout = layer.Dropout(ratio=0.3)
+
+    def forward(self, x):
+        x = self.layer1(x)
+        x = self.bn1(x)
+        x = self.layer2(x)
+        x = self.bn2(x)
+        x = self.pooling2(x)
+
+        x = self.layer3(x)
+        x = self.bn3(x)
+        x = self.layer4(x)
+        x = self.bn4(x)
+        x = self.layer5(x)
+        x = self.bn5(x)
+        x = self.pooling5(x)
+        x = self.flatten(x)
+        x = self.linear1(x)
+        x = self.relu(x)
+        x = self.linear2(x)
+        x = self.relu(x)
+        x = self.linear3(x)
+        return x
+
+    def set_optimizer(self, optimizer):
+        self.optimizer = optimizer
+
+    def train_one_batch(self, x, y, dist_option, spars):
+        out = self.forward(x)
+        loss = self.softmax_cross_entropy(out, y)
+
+        if dist_option == 'plain':
+            self.optimizer(loss)
+        elif dist_option == 'half':
+            self.optimizer.backward_and_update_half(loss)
+        elif dist_option == 'partialUpdate':
+            self.optimizer.backward_and_partial_update(loss)
+        elif dist_option == 'sparseTopK':
+            self.optimizer.backward_and_sparse_update(loss,
+                                                      topK=True,
+                                                      spars=spars)
+        elif dist_option == 'sparseThreshold':
+            self.optimizer.backward_and_sparse_update(loss,
+                                                      topK=False,
+                                                      spars=spars)
+        return out, loss
+
+
+def create_model(**kwargs):
+    """Constructs a CNN model.
+
+    Args:
+        **kwargs: forwarded to `CNNModel` (`num_classes`).
+
+    Returns:
+        The created CNN model.
+    """
+    model = CNNModel(**kwargs)
+
+    return model
+
+
+__all__ = ['CNNModel', 'create_model']
\ No newline at end of file