Ke (Kevin) Wu, PhD
Rensselaer Polytechnic Institute, 5.29.2016
wkcoke.work@gmail.com

This program implements four pretraining algorithms studied in the paper
"Efficient Node-By-Node Greedy Deep Learning for Interpretable Feature
Representation", submitted to ECML06. The four algorithms are supervised
pretraining, unsupervised pretraining (autoencoder), greedy-by-node (GN),
and greedy-by-class-by-node (GCN).

This program works solely with a cleaned dataset; the user is responsible
for creating the training X, training response, test X, and test response.
All data files should be in NumPy format, written with np.save(); please
refer to the example files, and see the data-preparation sketch at the end
of this file.

A main file should be written by the user. One example is provided:
"main_cancer.py". The parameters are set through a configuration file with
a flexible format; an illustrative configuration sketch is given at the end
of this file.

//***************************************************************************//

Parameter settings:

method: defines the upper-level structure. In the current code, the value
    can be either "super" (for supervised pretraining) or "normal" (for the
    other three methods).

opt_method: defines the optimization routine; the four algorithms differ in
    how they train a single layer. "unsup" is for the autoencoder, "super"
    is for supervised pretraining, "greedy" is for GN, and "greedy_class"
    is for GCN. SGD is the only optimization algorithm implemented.

    The valid combinations are:

        method      opt_method      algorithm
        ------      ----------      ---------
        super       super           supervised pretraining
        normal      unsup           unsupervised pretraining (autoencoder)
        normal      greedy          greedy-by-node (GN)
        normal      greedy_class    greedy-by-class-by-node (GCN)

option_list: chooses a subset of classes. For example, one can train only
    on classes 7 and 9 by defining 7,9. This function is enabled through
    the DataWrapper module (a sketch of the effect is given at the end of
    this file).

alpha: learning rate; please refer to the paper.

epoch_limit: the number of epochs for pretraining.

Geometry: defines the internal network structure (excluding the input and
    output layers).

FT: Y turns on fine-tuning; N means no output layer and no fine-tuning.

AF: amnesia factor, defaults to 1.0; please refer to the paper.

alpha_ft: learning rate for the fine-tuning and logistic regression layer
    (final layer).

epoch_limit_ft: the number of epochs for fine-tuning.

batchsize: batch size for the fine-tuning and logistic regression layer
    (final layer).

There are examples in the folder, using the Wisconsin Cancer data.

//***************************************************************************//

Note: based on a limited number of heuristic tests, the total number of
updates for a single-internal-node network should not be less than
4/alpha. If a large network is used on a small data set, the user should
increase the epoch limit (see the sanity-check sketch at the end of this
file).

//***************************************************************************//

Example sketches (illustrative only):
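1. Data preparation. A minimal sketch of writing the four required arrays
with np.save(). The file names (train_x.npy, etc.) and the random placeholder
data are assumptions for illustration; use whatever names your main file
loads.

    import numpy as np

    # Placeholder data: 100 samples with 30 features and binary labels.
    X = np.random.rand(100, 30)
    y = np.random.randint(0, 2, 100)

    # Simple 80/20 split into training and test sets.
    split = 80
    np.save('train_x.npy', X[:split])   # training X
    np.save('train_y.npy', y[:split])   # training response
    np.save('test_x.npy', X[split:])    # test X
    np.save('test_y.npy', y[split:])    # test response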
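2. Configuration file. Since the configuration format is flexible, the
key = value layout and the concrete values below are only one plausible
sketch; consult the shipped example configuration for the exact syntax the
parser accepts.

    method         = normal        # "super" or "normal"
    opt_method     = greedy        # "super", "unsup", "greedy", "greedy_class"
    option_list    = 7,9           # optional: train only on classes 7 and 9
    alpha          = 0.01          # pretraining learning rate
    epoch_limit    = 400           # pretraining epochs
    Geometry       = 20,10         # internal layers: 20 nodes, then 10 nodes
    FT             = Y             # Y: fine-tune; N: no output layer
    AF             = 1.0           # amnesia factor
    alpha_ft       = 0.1           # learning rate for final layer / fine-tuning
    epoch_limit_ft = 100           # fine-tuning epochs
    batchsize      = 10            # batch size for final layer / fine-tuning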
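3. Class subset selection. In the program this is handled internally by the
DataWrapper module; the plain-NumPy sketch below only illustrates the effect
of option_list = 7,9 and is not the DataWrapper API.

    import numpy as np

    X = np.load('train_x.npy')
    y = np.load('train_y.npy')

    # Keep only the samples whose label is 7 or 9.
    mask = np.isin(y, [7, 9])
    X_sub, y_sub = X[mask], y[mask]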
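4. Epoch sanity check. A sketch of the 4/alpha rule of thumb from the note
above, under the assumption (not stated in the code) that plain SGD performs
one update per training sample per epoch.

    import math

    alpha = 0.01       # pretraining learning rate from the configuration
    epoch_limit = 400  # pretraining epochs from the configuration
    n_train = 80       # number of training samples

    # Assumed: total updates = epochs * samples for per-sample SGD.
    total_updates = epoch_limit * n_train
    if total_updates < 4.0 / alpha:
        needed = math.ceil(4.0 / (alpha * n_train))
        print('Too few updates; raise epoch_limit to at least', needed)

//***************************************************************************//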