Block coordinate descent methods for obtaining vector representations for words

Academic year: 2021

Master’s Thesis

Guidance

Professor Nobuo YAMASHITA

Ryota KATSUKI

Department of Applied Mathematics and Physics Graduate School of Informatics

Kyoto University

KYOTO UNIVERSITY, FOUNDED 1897, KYOTO JAPAN

February 2018

Abstract

In recent years, distributed representations of words have been widely used in natural language processing. The idea is to assign a low-dimensional vector to each word so that the vectors reflect similarities and analogies among words. In particular, Global Vectors for word representation (GloVe) has attracted much attention as a model for obtaining high-quality distributed representations. GloVe obtains word representations by solving a large-scale optimization problem whose variables are the vectors representing all words in a large corpus. The optimization problem is solved by the stochastic gradient descent method. However, that method does not fully exploit the special structure of GloVe, which makes convergence slow. Moreover, even when it returns a solution, that solution may not be accurate enough.
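For reference, the optimization problem mentioned above can be written as follows. This is the objective as given in the original GloVe paper by Pennington, Socher, and Manning (2014), not a formula taken from this abstract, and the thesis may use different notation. Here $V$ is the vocabulary size, $X_{ij}$ is the co-occurrence count of words $i$ and $j$, $w_i$ and $\tilde{w}_j$ are the word and context vectors, $b_i$ and $\tilde{b}_j$ are bias terms, and $f$ is the weighting function with the defaults $x_{\max} = 100$ and $\alpha = 3/4$ suggested in that paper.

```latex
\min_{w,\tilde{w},b,\tilde{b}} \;
J = \sum_{i,j=1}^{V} f(X_{ij})
    \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^{2},
\qquad
f(x) =
\begin{cases}
  (x / x_{\max})^{\alpha} & \text{if } x < x_{\max}, \\
  1 & \text{otherwise.}
\end{cases}
```

Since $f(0) = 0$, pairs that never co-occur contribute nothing to the sum.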

In this paper, we first propose a block coordinate descent (BCD) method that exploits the structure of GloVe’s optimization problem. The objective function of the problem is a sum of squares of functions that combine bilinear and linear terms, so the problem reduces to a linear least squares problem when some of the variables are fixed. Since a solution of the linear least squares problem can be expressed explicitly, BCD can be implemented efficiently. We show the global convergence of this method. Moreover, we perform numerical experiments and show that the proposed method is faster than the existing ones. However, the distributed representations obtained by the proposed method turn out not to perform very well. This is because the optimization problem in GloVe is non-convex and has many local optima, so the quality of the distributed representations depends on the optimization method even if the optimal values are the same.
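The reduction to linear least squares can be illustrated with a minimal sketch. It is not the algorithm of the thesis: the choice of blocks (all word vectors and biases, then all context vectors and biases, updated one row at a time), the function names `glove_weight`, `objective`, `update_block`, and `bcd_glove`, and the dense count matrix are assumptions made for illustration. The objective and weighting function follow the original GloVe paper.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    # GloVe weighting function f(x); f(0) = 0, so zero counts drop out.
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def objective(X, W, Wc, b, bc):
    # Weighted sum of squared residuals over the nonzero co-occurrence counts.
    mask = X > 0
    log_x = np.log(np.where(mask, X, 1.0))
    resid = W @ Wc.T + b[:, None] + bc[None, :] - log_x
    return float(np.sum(glove_weight(X) * mask * resid ** 2))

def update_block(X, W, b, Wc, bc):
    # With (Wc, bc) fixed, each pair (W[i], b[i]) solves its own
    # weighted linear least squares problem; solve it exactly, in place.
    for i in range(X.shape[0]):
        cols = np.nonzero(X[i])[0]
        if cols.size == 0:
            continue  # row i has no observed counts, nothing to fit
        s = np.sqrt(glove_weight(X[i, cols]))
        A = np.hstack([Wc[cols], np.ones((cols.size, 1))]) * s[:, None]
        y = (np.log(X[i, cols]) - bc[cols]) * s
        sol = np.linalg.lstsq(A, y, rcond=None)[0]
        W[i], b[i] = sol[:-1], sol[-1]

def bcd_glove(X, dim=5, sweeps=10, seed=0):
    # Alternate exact minimization over the word block and the context block.
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = 0.1 * rng.standard_normal((n, dim))
    Wc = 0.1 * rng.standard_normal((m, dim))
    b, bc = np.zeros(n), np.zeros(m)
    history = [objective(X, W, Wc, b, bc)]
    for _ in range(sweeps):
        update_block(X, W, b, Wc, bc)      # word vectors and biases
        update_block(X.T, Wc, bc, W, b)    # context vectors and biases
        history.append(objective(X, W, Wc, b, bc))
    return W, Wc, b, bc, history
```

Because every block update is an exact minimization over that block, the recorded objective values in `history` should not increase from one sweep to the next, up to rounding. This says nothing about the quality of the resulting vectors, which is the issue the abstract raises.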

To overcome this drawback, we investigate the cause of the performance degradation and propose several improvements to the method. Moreover, assuming that some word analogies are known in advance, we propose a new GloVe model that exploits such information.

Finally, we give some numerical results, showing that distributed representations obtained by our improved method with the new GloVe model can achieve higher performance than the ones obtained by the existing approaches.
