Conditional Random Field (CRF)

Source: Baidu Wenku. Editor: 偶看新闻. Time: 2024/04/30 09:15:40

From Wikipedia, the free encyclopedia

A conditional random field (CRF) is a type of discriminative probabilistic model most often used for the labeling or parsing of sequential data, such as natural language text or biological sequences.

1 Description
- 1.1 Relationship to hidden Markov models
- 1.2 Higher-order CRFs and semi-Markov CRFs
2 Software
3 See also
4 References
5 External links

Description

Much like a Markov random field, a CRF is an undirected graphical model in which each vertex represents a random variable whose distribution is to be inferred, and each edge represents a dependency between two random variables. For the current discussion, assume that the input sequence X represents a sequence of observations and that Y represents the hidden (or unknown) state variables to be inferred from those observations. In a CRF, the distribution of each discrete random variable Y_i in the graph is conditioned on the input sequence X.

In principle, the layout of the graph of random variables Y can be arbitrary; most often, however, the Y_i are structured to form a chain, with an edge between each Y_{i − 1} and Y_i. This layout has a simple interpretation, with each Y_i acting as the "label" of one element in the input sequence. It also admits efficient algorithms for three tasks: model training (learning the conditional distributions between the Y_i and feature functions from some corpus of training data), inference (determining the probability of a given label sequence Y given X), and decoding (determining the most likely label sequence Y given X).

The conditional dependency of each Y_i on X is defined through a fixed set of feature functions of the form f(i,Y_{i − 1},Y_i,X), which can informally be thought of as measurements on the input sequence that partially determine the likelihood of each possible value for Y_i. The model assigns each feature a numerical weight and combines them to determine the probability of a certain value for Y_i.
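The way weighted feature functions combine into a probability can be sketched for a tiny chain. The example below assumes the standard log-linear form, in which P(Y | X) is proportional to the exponential of the weighted feature sum; the text above does not spell this form out. The label set, the three feature functions, and their weights are hypothetical and chosen by hand, and the normalizer is computed by brute-force enumeration, which is only feasible for very short sequences.

```python
import itertools
import math

LABELS = ["N", "V"]  # hypothetical label set (noun, verb)

# Each feature function has the form f(i, y_prev, y, x) used in the text.
def f_capitalized_noun(i, y_prev, y, x):
    return 1.0 if x[i][0].isupper() and y == "N" else 0.0

def f_noun_then_verb(i, y_prev, y, x):
    return 1.0 if y_prev == "N" and y == "V" else 0.0

def f_ends_in_s_verb(i, y_prev, y, x):
    return 1.0 if x[i].endswith("s") and y == "V" else 0.0

# (feature function, weight) pairs; the weights are made up, not trained.
FEATURES = [
    (f_capitalized_noun, 1.5),
    (f_noun_then_verb, 1.0),
    (f_ends_in_s_verb, 0.8),
]

def score(y, x):
    """Weighted sum of every feature function over every position."""
    total = 0.0
    for i in range(len(x)):
        y_prev = y[i - 1] if i > 0 else None  # None marks the sequence start
        for f, weight in FEATURES:
            total += weight * f(i, y_prev, y[i], x)
    return total

def conditional_probability(y, x):
    """P(y | x) = exp(score(y, x)) / Z(x), with Z(x) summed by brute force."""
    z = sum(
        math.exp(score(list(candidate), x))
        for candidate in itertools.product(LABELS, repeat=len(x))
    )
    return math.exp(score(y, x)) / z

x = ["Dogs", "bark"]
distribution = {
    labels: conditional_probability(list(labels), x)
    for labels in itertools.product(LABELS, repeat=len(x))
}
for labels, p in distribution.items():
    print(labels, round(p, 3))
```

Because the scores are normalized over whole label sequences, the feature values and weights themselves need not be probabilities.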

Relationship to hidden Markov models

CRFs have many of the same applications as conceptually simpler hidden Markov models (HMMs), but relax certain assumptions about the input and output sequence distributions. An HMM can loosely be understood as a CRF with very specific feature functions that use constant probabilities to model state transitions and emissions. Conversely, a CRF can loosely be understood as a generalization of an HMM that makes the constant transition probabilities into arbitrary functions that vary across the positions in the sequence of hidden states, depending on the input sequence.
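The correspondence can be checked numerically on a toy model. The sketch below assumes a hypothetical two-state HMM with made-up probabilities and builds a CRF-style score whose "features" are indicators for the start state, each transition, and each emission, weighted by the logarithm of the matching HMM probability. Under that assumption, the CRF conditional P(Y | X) equals the HMM posterior.

```python
import itertools
import math

# Hypothetical HMM parameters (illustrative numbers only).
STATES = ["Rainy", "Sunny"]
START = {"Rainy": 0.6, "Sunny": 0.4}
TRANS = {
    ("Rainy", "Rainy"): 0.7, ("Rainy", "Sunny"): 0.3,
    ("Sunny", "Rainy"): 0.4, ("Sunny", "Sunny"): 0.6,
}
EMIT = {
    ("Rainy", "walk"): 0.1, ("Rainy", "shop"): 0.4, ("Rainy", "clean"): 0.5,
    ("Sunny", "walk"): 0.6, ("Sunny", "shop"): 0.3, ("Sunny", "clean"): 0.1,
}

def hmm_joint(y, x):
    """Joint probability P(x, y) under the HMM."""
    p = START[y[0]] * EMIT[(y[0], x[0])]
    for i in range(1, len(x)):
        p *= TRANS[(y[i - 1], y[i])] * EMIT[(y[i], x[i])]
    return p

def crf_score(y, x):
    """Sum of the weights of the indicator features that fire.

    Each weight is the log of a constant HMM probability, so the score
    depends on x only through the emission at the current position.
    """
    total = 0.0
    for i in range(len(x)):
        if i == 0:
            total += math.log(START[y[0]])
        else:
            total += math.log(TRANS[(y[i - 1], y[i])])
        total += math.log(EMIT[(y[i], x[i])])
    return total

def all_sequences(n):
    return itertools.product(STATES, repeat=n)

def hmm_posterior(y, x):
    return hmm_joint(y, x) / sum(hmm_joint(c, x) for c in all_sequences(len(x)))

def crf_conditional(y, x):
    z = sum(math.exp(crf_score(c, x)) for c in all_sequences(len(x)))
    return math.exp(crf_score(y, x)) / z

x = ["walk", "shop", "clean"]
for y in all_sequences(len(x)):
    print(y, round(hmm_posterior(y, x), 4), round(crf_conditional(y, x), 4))
```

A general CRF drops the restriction that the weights be log-probabilities and lets each feature look at any part of x.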

Notably in contrast to HMMs, CRFs can contain any number of feature functions, the feature functions can inspect the entire input sequence X at any point during inference, and the range of the feature functions need not have a probabilistic interpretation.

The well-known forward-backward and Viterbi algorithms for HMMs have direct analogues for CRFs, with the same asymptotic running times. The training step, which determines a weight for each feature function, is somewhat more complex: in general there is no closed-form solution for the optimal weights, so they must be found by numerical optimization. Common techniques include gradient descent and quasi-Newton methods such as the L-BFGS algorithm.
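As an illustration of those analogues, the sketch below implements the forward recursion (to obtain the log of the normalizer Z(X)) and Viterbi decoding for a linear-chain CRF in log space. The label set and the hand-written `log_potential` function, which stands in for the weighted feature sum at one position, are hypothetical. Both routines take time proportional to the sequence length times the square of the number of labels.

```python
import math

LABELS = ["B", "I", "O"]  # hypothetical chunking labels

def log_potential(i, y_prev, y, x):
    """Hypothetical weighted feature sum at position i (weights made up)."""
    capitalized = x[i][0].isupper()
    total = 0.0
    if capitalized and y == "B":
        total += 2.0
    if capitalized and y_prev == "B" and y == "I":
        total += 3.0
    if x[i].islower() and y == "O":
        total += 1.0
    if y == "I" and y_prev in (None, "O"):
        total -= 3.0  # discourage "I" without a preceding chunk
    return total

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_partition(x):
    """Forward algorithm: returns log Z(x)."""
    alpha = {y: log_potential(0, None, y, x) for y in LABELS}
    for i in range(1, len(x)):
        alpha = {
            y: logsumexp([alpha[p] + log_potential(i, p, y, x) for p in LABELS])
            for y in LABELS
        }
    return logsumexp(list(alpha.values()))

def viterbi(x):
    """Returns the highest-scoring label sequence and its score."""
    delta = {y: log_potential(0, None, y, x) for y in LABELS}
    backpointers = []
    for i in range(1, len(x)):
        new_delta, pointers = {}, {}
        for y in LABELS:
            best_prev = max(
                LABELS, key=lambda p: delta[p] + log_potential(i, p, y, x)
            )
            new_delta[y] = delta[best_prev] + log_potential(i, best_prev, y, x)
            pointers[y] = best_prev
        delta = new_delta
        backpointers.append(pointers)
    last = max(LABELS, key=lambda y: delta[y])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]

sentence = ["She", "visited", "New", "York"]
best_path, best_score = viterbi(sentence)
print(best_path, best_score)
print("log P(best | x) =", best_score - log_partition(sentence))
```

The same forward quantities, combined with a backward pass, give the expected feature counts that gradient-based training needs; that part is not shown here.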

Higher-order CRFs and semi-Markov CRFs

CRFs can be extended into higher-order models by making each Y_i dependent on a fixed number o of previous variables Y_{i − o},...,Y_{i − 1}. Training and inference are only practical for small values of o,[citation needed] since their computational cost increases exponentially with o. Large-margin models for structured prediction, such as the structured support vector machine, can be seen as an alternative training procedure to CRFs.
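One way to see the exponential growth is to treat an order-o chain as a first-order chain over composite states, where each composite state is the tuple of the last o labels. The sketch below, which uses a hypothetical three-label set, counts those composite states and the transitions between them; the per-position work of a forward or Viterbi pass is proportional to the transition count.

```python
import itertools

def count_states_and_transitions(labels, order):
    """Count composite states and transitions of an order-`order` chain.

    A composite state is the tuple of the last `order` labels. A transition
    appends one new label and drops the oldest one.
    """
    states = list(itertools.product(labels, repeat=order))
    transitions = {
        (state, state[1:] + (new_label,))
        for state in states
        for new_label in labels
    }
    return len(states), len(transitions)

labels = ["B", "I", "O"]  # hypothetical label set
for order in range(1, 5):
    n_states, n_transitions = count_states_and_transitions(labels, order)
    print(order, n_states, n_transitions)
```

With L labels there are L^o composite states and L^(o + 1) transitions per position.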

There exists another generalization of CRFs, the semi-Markov conditional random field (semi-CRF), which models variable-length segmentations of the label sequence Y. This provides much of the power of higher-order CRFs to model long-range dependencies of the Y_i, at a reasonable computational cost.
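A minimal sketch of semi-Markov decoding follows, assuming a hypothetical two-label set, a maximum segment length, and a hand-written `segment_score` function in place of learned weights. Instead of scoring one label per position, it scores whole segments (start, end, label), so a feature such as "every word in the segment is capitalized" can be expressed directly. The dynamic program considers every segment length up to `MAX_LEN` at each end position, which costs roughly the sequence length times `MAX_LEN` times the square of the number of labels.

```python
LABELS = ["ENT", "O"]  # hypothetical segment labels
MAX_LEN = 3            # assumed maximum segment length

def segment_score(x, start, end, y_prev, y):
    """Hypothetical score for labeling x[start:end] as one segment y."""
    words = x[start:end]
    if y == "ENT":
        all_capitalized = all(w[0].isupper() for w in words)
        total = 1.5 * len(words) if all_capitalized else -2.0
        if y_prev == "ENT":
            total -= 1.0  # prefer one long entity over adjacent short ones
        return total
    total = sum(1.0 if w.islower() else -1.0 for w in words)
    if len(words) > 1:
        total -= 0.5  # prefer single-word "O" segments
    return total

def semi_markov_viterbi(x):
    """Returns the best segmentation as (start, end, label) triples."""
    n = len(x)
    # best[end][y] = (score, (start, y_prev)) for the best segmentation of
    # x[:end] whose last segment is x[start:end] with label y.
    best = [dict() for _ in range(n + 1)]
    best[0][None] = (0.0, None)
    for end in range(1, n + 1):
        for y in LABELS:
            candidates = []
            for length in range(1, min(MAX_LEN, end) + 1):
                start = end - length
                for y_prev, (prev_score, _) in best[start].items():
                    s = prev_score + segment_score(x, start, end, y_prev, y)
                    candidates.append((s, (start, y_prev)))
            best[end][y] = max(candidates, key=lambda c: c[0])
    y = max(best[n], key=lambda label: best[n][label][0])
    total = best[n][y][0]
    segments, end = [], n
    while end > 0:
        _, (start, y_prev) = best[end][y]
        segments.append((start, end, y))
        end, y = start, y_prev
    segments.reverse()
    return segments, total

words = ["we", "visited", "New", "York", "City", "today"]
segments, total = semi_markov_viterbi(words)
for start, end, label in segments:
    print(label, words[start:end])
```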

Software

This is a partial list of software that implements CRF-related tools.

MALLET (Java)
ABNER (Java)
MinorThird (Java)
Kevin Murphy's MATLAB CRF code (MATLAB)
Sunita Sarawagi's CRF package (Java)
HCRF library (including CRF and LDCRF) (C++, MATLAB)
CRFSuite: fast CRF implementation (C)
Xcrf for XML data (Java)
CRF++ (C++)
sgd: an LGPL C++ library implementing stochastic gradient descent, with applications to learning CRFs and support vector machines
FlexCRFs (including a parallel implementation) (C++)
JProGraM (Java)

See also

Graphical model

References

Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: Proc. 18th International Conf. on Machine Learning, Morgan Kaufmann, San Francisco, CA (2001) 282–289
McCallum, A.: Efficiently inducing features of conditional random fields. In: Proc. 19th Conference on Uncertainty in Artificial Intelligence. (2003)
Sha, F., Pereira, F.: Shallow parsing with conditional random fields. Technical Report MS-CIS-02-35, University of Pennsylvania (2003)
Wallach, H.M.: Conditional random fields: An introduction. Technical Report MS-CIS-04-21, University of Pennsylvania (2004)
Sutton, C., McCallum, A.: An introduction to conditional random fields for relational learning. In: Getoor, L., Taskar, B. (eds.) Introduction to Statistical Relational Learning. MIT Press (2006)
Klinger, R., Tomanek, K.: Classical probabilistic models and conditional random fields. Algorithm Engineering Report TR07-2-013, Department of Computer Science, Dortmund University of Technology (December 2007). ISSN 1864-4503

External links

An annotated bibliography by Hanna M. Wallach
