This page links to all terms that have no external references listed.
- Abscissa
- Abstractive sentence summarization
- AdaBoost
- Adaptive learning rate
- Additive clustering
- Affine space
- Antonym
- Attention Mechanism
- Average pooling
- Bag-of-n-grams
- Bayesian optimization
- Bayesian Probabilistic Matrix Factorization (BPMF)
- Bidirectional LSTM
- Bidirectional Recurrent Neural Network (BRNN)
- Bilingual Evaluation Understudy (BLEU)
- Black-Box optimization
- Bounding box
- Categorical mixture model
- Chinese Restaurant Process
- Chunking
- Clustering stability
- Clustering
- Co-clustering
- Codebook collapse
- Collaborative filtering
- Collaborative Topic Regression (CTR)
- Community detection
- Conditional GAN
- Conditional Markov Models (CMMs)
- Confusion matrix
- Continuous-Bag-of-Words (CBOW)
- Contractive autoencoder (CAE)
- Convex optimization
- Convolution
- Coreference resolution
- Cosine similarity
- Covariance
- Cross-Entropy loss
- Denoising autoencoder
- Differential Topic Modeling
- Dirichlet process
- doc2vec
- Early stopping
- Expectation
- Exploding gradient problem
- Face detection
- Facet (disambiguation)
- fastText
- First-order information
- Gibbs sampling
- GoogLeNet
- Gradient descent
- Gradient
- Graph
- Grid search
- Hadamard product
- Hamming distance
- Hessian matrix
- Hidden Markov Models (HMMs)
- Hierarchical Dirichlet process (HDP)
- Hierarchical Latent Dirichlet allocation (hLDA)
- Hierarchical Softmax
- Hinge loss
- Hypernym
- Hyperparameter
- Hyponym
- Identity mapping
- Importance sampling
- Inceptionism
- Independent and Identically Distributed (i.i.d.)
- Jaccard index (intersection over union)
- Jacobian matrix
- K-Means clustering
- Kullback-Leibler (KL) divergence
- Language segmentation
- Laplacian matrix
- Latent Dirichlet allocation (LDA)
- Leaky ReLU
- Learning rate
- Lexeme
- Likelihood
- Linear discriminant analysis (LDA)
- Loss function
- Market basket analysis
- Markov Chain Monte Carlo (MCMC)
- Max-margin loss
- Max Pooling
- Maximum A Posteriori (MAP) Estimation
- Maximum Likelihood Estimation (MLE)
- Meronym
- Minibatch Gradient Descent
- Mini-Batching
- Minimal matching distance
- Mixed-membership model
- Mode collapse
- Model averaging
- Model compression
- Moore-Penrose Pseudoinverse
- Multi-Armed Bandit
- Multilayer LSTM
- Multinomial distribution
- Narrow convolution
- Natural Language Processing
- NCHW
- Negative Sampling
- Nested Chinese Restaurant Process
- Neural network
- NHWC
- Nonparametric clustering
- Object detection
- Object localization
- One-dimensional convolution
- Optimization
- Padding (convolution)
- Paragraph vector
- Parameter budget
- Parametric clustering
- Peephole connection
- Perplexity
- Pertainym
- Pitman-Yor Topic Modeling (PYTM)
- Point Estimator
- Pointwise Mutual Information (PMI)
- Polysemy
- Positive Pointwise Mutual Information (PPMI)
- Principal Component Analysis (PCA)
- Probabilistic Latent Semantic Indexing (PLSI)
- Probabilistic Matrix Factorization (PMF)
- Pseudo-labeling
- Q-learning
- Rand Index
- Random Forest (RF)
- Random search
- Receiver Operating Characteristic (ROC)
- Recursive Neural Network
- Regression-based latent factors (RLFM)
- Regularization
- RMSProp
- Second-order information
- Singular Value Decomposition (SVD)
- Skip-Gram
- Sparse autoencoder
- Spearman's Rank Correlation Coefficient
- Stacked autoencoder
- Stationary environment
- Stochastic block model (SBM)
- Stochastic Gradient Descent (SGD)
- Stochastic Gradient Variational Bayes (SGVB)
- Stochastic Optimization
- Stride (convolution)
- Support Vector Machine (SVM)
- Synset
- Temporal classification
- Test term
- Time-delayed neural network
- Time-delayed signal
- Troponym
- Underfitting
- UNK
- Vanishing gradient problem
- Variational Autoencoder (VAE)
- Wide convolution
- Word embedding
- word2phrase
- Wronskian matrix