Several recent papers have explored self-supervised learning methods for vision transformers (ViT). Key approaches include:
1. Masked prediction tasks that predict masked patches of the input image.
2. Contrastive learning using techniques like MoCo to learn representations by contrasting augmented views of the same image.
3. Self-distillation methods like DINO that distill a teacher ViT into a student ViT using different views of the same image.
4. Hybrid approaches that combine masked prediction with self-distillation, such as iBOT.
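The masked-prediction idea in (1) can be illustrated with a minimal sketch. This is a toy illustration in NumPy, not any specific paper's implementation: the patch count, mask ratio, and the stand-in "predictor" are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image": 16 patches, each flattened to an 8-dim vector.
patches = rng.normal(size=(16, 8))

# Mask 6 of the 16 patches (the mask ratio is an arbitrary choice here).
mask = np.zeros(16, dtype=bool)
mask[rng.choice(16, size=6, replace=False)] = True
visible = patches[~mask]

# Stand-in "predictor": the mean of the visible patches, used as the
# prediction for every masked patch. A real ViT would instead run
# transformer layers over the visible patches.
prediction = np.tile(visible.mean(axis=0), (mask.sum(), 1))

# Self-supervised objective: reconstruct the masked patches themselves.
loss = float(np.mean((prediction - patches[mask]) ** 2))
print(loss)
```

The point of the sketch is the training signal: the loss is defined entirely by the input image, with no labels involved.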
This document discusses methods for automated machine learning (AutoML) and optimization of hyperparameters. It focuses on accelerating the Nelder-Mead method for hyperparameter optimization using predictive parallel evaluation. Specifically, it proposes using a Gaussian process to model the objective function and perform predictive evaluations in parallel to reduce the number of actual function evaluations needed by the Nelder-Mead method. The results show this approach reduces evaluations by 49-63% compared to baseline methods.
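The central building block described above is a Gaussian-process surrogate that predicts the objective at a candidate simplex point before spending a real evaluation on it. The following is a minimal sketch of that building block only; the RBF kernel, length scale, noise level, and toy objective are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """RBF kernel matrix between row-vector sets a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_predict(X, y, Xq, noise=1e-6):
    """GP posterior mean/variance at query points Xq given data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xq)
    Kss = rbf(Xq, Xq)
    alpha = np.linalg.solve(K, y)
    mean = Ks.T @ alpha
    var = np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks))
    return mean, var

# Toy objective standing in for an expensive hyperparameter evaluation.
f = lambda x: np.sin(3 * x[..., 0]) + x[..., 0] ** 2

X = np.linspace(-1, 1, 8)[:, None]   # points already evaluated for real
y = f(X)
Xq = np.array([[0.3]])               # candidate Nelder-Mead point

mean, var = gp_predict(X, y, Xq)
# When the predictive variance is small, the surrogate's mean can stand
# in for the real evaluation at this point, saving one function call.
print(float(mean[0]), float(var[0]))
```

The variance term is what makes "predictive" evaluation safe: the method can fall back to a real evaluation wherever the model is uncertain.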
Material (partially modified) from my turn presenting at the lab's internal study group on May 22.
It introduces the following five oral papers, on topics such as image caption generation and image generation using deep learning:
1. Show and Tell: A Neural Image Caption Generator
2. Long-term Recurrent Convolutional Networks for Visual Recognition and Description
3. Deep Visual-Semantic Alignments for Generating Image Descriptions
4. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
5. Understanding Deep Image Representations by Inverting Them
The Internet has been evolving. One of the major reasons IPv6 should be deployed now is to restore the end-to-end principle of the Internet. However, as the Internet has changed dramatically in the last decade, returning to its original form is very difficult. In this presentation, I will discuss what is happening today and how we can best sustain and improve the Internet.
The document introduces primitiv, a neural network toolkit that uses computation graphs and dynamic construction with lazy evaluation. It discusses different strategies for constructing computation graphs, including static, dynamic define-by-run, and dynamic with lazy evaluation. Primitiv uses the dynamic with lazy evaluation approach, which allows for interactive graph construction while also enabling just-in-time optimization. The document provides an overview of primitiv's design goals, which include being simple, compact, device-independent, supporting implicit minibatching, and allowing usage from multiple languages.
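The "dynamic with lazy evaluation" strategy can be illustrated with a toy graph: constructing a node only records the operation, and nothing is computed until a value is explicitly requested, which is exactly where a backend could apply just-in-time optimization over the recorded graph. This is a generic Python sketch of the idea, not primitiv's actual API.

```python
class Node:
    """Records an operation; evaluation is deferred until .value()."""
    def __init__(self, op, args=(), const=None):
        self.op, self.args, self.const = op, args, const
        self._cache = None

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))

    def value(self):
        # Lazy evaluation: compute (and cache) only when asked.
        # A real toolkit could inspect the whole recorded graph here
        # and optimize it before executing anything.
        if self._cache is None:
            if self.op == "const":
                self._cache = self.const
            elif self.op == "add":
                self._cache = self.args[0].value() + self.args[1].value()
            elif self.op == "mul":
                self._cache = self.args[0].value() * self.args[1].value()
        return self._cache

def const(v):
    return Node("const", const=v)

# The graph is built interactively, but nothing runs yet...
x = const(2.0)
y = (x + const(3.0)) * x
# ...until a value is requested, triggering one evaluation pass.
print(y.value())  # → 10.0
```

This combines the interactivity of define-by-run (the graph is built by ordinary host-language code) with the whole-graph view that static construction offers at execution time.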
Neural Machine Translation via Binary Code Prediction (Yusuke Oda)
Neural machine translation models tend to be heavy due to the softmax output layer requiring O(V) computation. This work proposes using binary code prediction as the output layer to reduce computation from O(V) to O(logV). Two improvements are introduced: a hybrid model combining softmax and binary layers, and applying error-correcting codes to make binary codes more robust. Experiments show the proposed models achieve comparable translation accuracy to softmax models while reducing output layer size by 10x and speeding up both training and testing, especially on CPUs where it is 10x faster.
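The core saving can be seen with a toy vocabulary: instead of a softmax over V outputs, the model predicts ⌈log₂ V⌉ independent bits, and the word is recovered by decoding them. The sketch below shows only this encoding/decoding step, with made-up bit scores standing in for the model's sigmoid outputs; it does not include the paper's hybrid-softmax or error-correcting-code extensions.

```python
import math

V = 1000                        # vocabulary size
bits = math.ceil(math.log2(V))  # only 10 outputs instead of 1000

def encode(word_id, bits):
    """Word id -> binary code (list of 0/1 bits, most significant first)."""
    return [(word_id >> i) & 1 for i in reversed(range(bits))]

def decode(code):
    """Binary code -> word id."""
    word_id = 0
    for b in code:
        word_id = (word_id << 1) | b
    return word_id

word = 617
code = encode(word, bits)
assert decode(code) == word

# At test time the output layer produces one score per bit; thresholding
# each score at 0.5 replaces the O(V) softmax with an O(log V) decision.
scores = [0.9 if b else 0.1 for b in code]  # stand-in sigmoid outputs
predicted = decode([1 if s > 0.5 else 0 for s in scores])
print(bits, predicted)  # → 10 617
```

The fragility of this scheme, where one flipped bit yields a completely different word, is what motivates the error-correcting codes mentioned above.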
Learning to Generate Pseudo-code from Source Code using Statistical Machine T... (Yusuke Oda)
This document summarizes a study on generating pseudo-code from source code using statistical machine translation techniques. The researchers introduced two frameworks: phrase-based machine translation and tree-to-string machine translation. Experiments were conducted on two corpora; the tree-to-string approach, modified to address issues with abstract syntax trees, generated the best pseudo-code according to both automatic and human evaluations. The generated pseudo-code was shown to help with code understanding tasks compared to source code alone.
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ... (Yusuke Oda)
This study proposes two methods for syntax-based simultaneous translation: 1) predicting and using unseen syntactic constituents, and 2) waiting for translation to avoid reordering problems. Experimental results on English to Japanese translation show the proposed approach prevents decreases in translation accuracy for short phrases compared to baselines, and provides more robustness to reordering. However, constituent prediction accuracy remains low due to redundant constituents in the gold syntax. Future work includes improving prediction and using additional context features.
Tree-based Translation Models (『機械翻訳』 §6.2-6.3) (Yusuke Oda)
This document discusses tree-based translation models including synchronous context-free grammar (SCFG), synchronous tree substitution grammar (STSG), and synchronous parsing. It covers topics such as learning SCFG and STSG from parallel corpora, introducing syntax labels, decoding, and rescoring. Tree-to-string, string-to-tree, and tree-to-tree translation models are discussed under the STSG framework. The Galley-Hopkins-Knight-Marcu (GHKM) algorithm for extracting STSG rules is also summarized.
Pattern Recognition and Machine Learning: Section 3.3 (Yusuke Oda)
The document discusses Bayesian linear regression. It introduces the parameter distribution by assuming a Gaussian prior distribution for the model parameters. This leads to a Gaussian posterior distribution. It then discusses the predictive distribution for new data points by marginalizing over the posterior distribution of the parameters. Finally, it introduces the concept of an equivalent kernel, which allows predictions to be written as a linear combination of the training targets using a kernel matrix rather than by calculating the model parameters.
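With a zero-mean Gaussian prior p(w) = N(0, α⁻¹I) and noise precision β, the posterior and the equivalent-kernel view summarized above can be checked numerically. Symbols follow PRML's notation for Section 3.3; the polynomial basis and toy data below are illustrative choices, not the book's example.

```python
import numpy as np

alpha, beta = 2.0, 25.0            # prior precision, noise precision
rng = np.random.default_rng(1)

# Toy 1-D data and polynomial basis functions phi(x) = (1, x, x^2).
x = rng.uniform(-1, 1, size=20)
t = np.sin(2 * x) + rng.normal(scale=0.2, size=20)
Phi = np.stack([np.ones_like(x), x, x**2], axis=1)

# Gaussian posterior over weights:
#   S_N = (alpha*I + beta * Phi^T Phi)^{-1},  m_N = beta * S_N Phi^T t
S_N = np.linalg.inv(alpha * np.eye(3) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t

# Predictive mean at a new point, written two equivalent ways:
x_new = 0.4
phi_new = np.array([1.0, x_new, x_new**2])
mean_direct = phi_new @ m_N
# Equivalent kernel: k(x, x_n) = beta * phi(x)^T S_N phi(x_n), so the
# prediction is a linear combination of the training targets t_n.
k = beta * phi_new @ S_N @ Phi.T
mean_kernel = k @ t

print(bool(np.isclose(mean_direct, mean_kernel)))  # → True
```

The last two lines are the point of the equivalent kernel: the same prediction is obtained by weighting the training targets directly, without ever referring to the model parameters.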
15/12/10 Copyright (C) 2015 by Yusuke Oda, AHC-Lab, IS, NAIST
Phrase-based translation
● Group word sequences into phrases, translate them, and reorder
  – currently the mainstream approach
彼 は 望遠鏡 で 女の子 を 見た
He saw / a girl / with / a telescope
He saw a girl with a telescope
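The slide's example can be reproduced with a toy phrase table: segment the Japanese sentence into phrases, translate each phrase, then reorder into English word order. The phrase table and the hand-fixed segmentation and reordering below are illustrative assumptions, not the output of a real decoder, which would search over segmentations and orderings.

```python
# Toy phrase table grounded in the slide's example sentence.
phrase_table = {
    "彼 は": "He",
    "望遠鏡 で": "with a telescope",
    "女の子 を": "a girl",
    "見た": "saw",
}

source = "彼 は 望遠鏡 で 女の子 を 見た"

# Segmentation into phrases (fixed by hand here; a decoder searches this).
segments = ["彼 は", "望遠鏡 で", "女の子 を", "見た"]
assert " ".join(segments) == source

# Translate each phrase, then reorder: Japanese SOV -> English SVO.
translated = [phrase_table[s] for s in segments]
order = [0, 3, 2, 1]   # He / saw / a girl / with a telescope
target = " ".join(translated[i] for i in order)
print(target)  # → He saw a girl with a telescope
```

The reordering step is what distinguishes phrase-based translation from word-for-word substitution: the phrase for 見た ("saw") must move from the end of the Japanese sentence to second position in the English one.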