Show, Attend and Tell: computing BLEU scores (1 to 4)
Calculate BLEU scores
Reference: GitHub issue "How to do calculate all bleu scores during evaluvaion" #37
It mainly comes down to changing one parameter. By default the BLEU-4 score is computed, as the source code makes clear: changing the default `weights=(0.25, 0.25, 0.25, 0.25)` in `corpus_bleu` yields the corresponding BLEU-1, BLEU-2, or BLEU-3 score instead.
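Concretely, with NLTK's `corpus_bleu`, BLEU-n uses uniform weights of 1/n over n-gram orders 1..n. A small sketch with toy tokenized data (the reference/hypothesis sentences here are made up for illustration; during evaluation you would pass your accumulated reference and hypothesis lists):

```python
from nltk.translate.bleu_score import corpus_bleu

# corpus_bleu expects:
#   references: a list (one entry per hypothesis) of lists of reference token lists
#   hypotheses: a list of hypothesis token lists
references = [[['a', 'cat', 'sits', 'on', 'the', 'mat']]]
hypotheses = [['a', 'cat', 'is', 'on', 'the', 'mat']]

# The weight tuple selects the n-gram orders: BLEU-n gives weight 1/n
# to each of the orders 1..n and zero to the rest.
bleu1 = corpus_bleu(references, hypotheses, weights=(1.0, 0, 0, 0))
bleu2 = corpus_bleu(references, hypotheses, weights=(0.5, 0.5, 0, 0))
bleu3 = corpus_bleu(references, hypotheses, weights=(1/3, 1/3, 1/3, 0))
bleu4 = corpus_bleu(references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25))

print(bleu1, bleu2, bleu3, bleu4)
```

Note that with no smoothing function, an order with zero matches (here, 4-grams) drives that BLEU score to effectively zero, which is why NLTK emits a warning for short or poorly matching hypotheses.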
The `corpus_bleu` source (read it all if you are interested; otherwise the parameters and docstring are enough):
```python
def corpus_bleu(
    list_of_references,
    hypotheses,
    weights=(0.25, 0.25, 0.25, 0.25),
    smoothing_function=None,
    auto_reweigh=False,
):
    """
    Calculate a single corpus-level BLEU score (aka. system-level BLEU) for all
    the hypotheses and their respective references.

    Instead of averaging the sentence level BLEU scores (i.e. macro-average
    precision), the original BLEU metric (Papineni et al. 2002) accounts for
    the micro-average precision (i.e. summing the numerators and denominators
    for each hypothesis-reference(s) pairs before the division).

    >>> hyp1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which',
    ...         'ensures', 'that', 'the', 'military', 'always',
    ...         'obeys', 'the', 'commands', 'of', 'the', 'party']
    >>> ref1a = ['It', 'is', 'a', 'guide', 'to', 'action', 'that',
    ...          'ensures', 'that', 'the', 'military', 'will', 'forever',
    ...          'heed', 'Party', 'commands']
    >>> ref1b = ['It', 'is', 'the', 'guiding', 'principle', 'which',
    ...          'guarantees', 'the', 'military', 'forces', 'always',
    ...          'being', 'under', 'the', 'command', 'of', 'the', 'Party']
    >>> ref1c = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the',
    ...          'army', 'always', 'to', 'heed', 'the', 'directions',
    ...          'of', 'the', 'party']

    >>> hyp2 = ['he', 'read', 'the', 'book', 'because', 'he', 'was',
    ...         'interested', 'in', 'world', 'history']
    >>> ref2a = ['he', 'was', 'interested', 'in', 'world', 'history',
    ...          'because', 'he', 'read', 'the', 'book']

    >>> list_of_references = [[ref1a, ref1b, ref1c], [ref2a]]
    >>> hypotheses = [hyp1, hyp2]
    >>> corpus_bleu(list_of_references, hypotheses)  # doctest: +ELLIPSIS
    0.5920...

    The example below show that corpus_bleu() is different from averaging
    sentence_bleu() for hypotheses

    >>> score1 = sentence_bleu([ref1a, ref1b, ref1c], hyp1)
    >>> score2 = sentence_bleu([ref2a], hyp2)
    >>> (score1 + score2) / 2  # doctest: +ELLIPSIS
    0.6223...

    :param list_of_references: a corpus of lists of reference sentences, w.r.t. hypotheses
    :type list_of_references: list(list(list(str)))
    :param hypotheses: a list of hypothesis sentences
    :type hypotheses: list(list(str))
    :param weights: weights for unigrams, bigrams, trigrams and so on
    :type weights: list(float)
    :param smoothing_function:
    :type smoothing_function: SmoothingFunction
    :param auto_reweigh: Option to re-normalize the weights uniformly.
    :type auto_reweigh: bool
    :return: The corpus-level BLEU score.
    :rtype: float
    """
    # Before proceeding to compute BLEU, perform sanity checks.
    p_numerators = Counter()  # Key = ngram order, and value = no. of ngram matches.
    p_denominators = Counter()  # Key = ngram order, and value = no. of ngram in ref.
    hyp_lengths, ref_lengths = 0, 0

    assert len(list_of_references) == len(hypotheses), (
        "The number of hypotheses and their reference(s) should be the same"
    )

    # Iterate through each hypothesis and their corresponding references.
    for references, hypothesis in zip(list_of_references, hypotheses):
        # For each order of ngram, calculate the numerator and
        # denominator for the corpus-level modified precision.
        for i, _ in enumerate(weights, start=1):
            p_i = modified_precision(references, hypothesis, i)
            p_numerators[i] += p_i.numerator
            p_denominators[i] += p_i.denominator

        # Calculate the hypothesis length and the closest reference length.
        # Adds them to the corpus-level hypothesis and reference counts.
        hyp_len = len(hypothesis)
        hyp_lengths += hyp_len
        ref_lengths += closest_ref_length(references, hyp_len)

    # Calculate corpus-level brevity penalty.
    bp = brevity_penalty(ref_lengths, hyp_lengths)

    # Uniformly re-weighting based on maximum hypothesis lengths if largest
    # order of n-grams < 4 and weights is set at default.
    if auto_reweigh:
        if hyp_lengths < 4 and weights == (0.25, 0.25, 0.25, 0.25):
            weights = (1 / hyp_lengths,) * hyp_lengths

    # Collects the various precision values for the different ngram orders.
    p_n = [
        Fraction(p_numerators[i], p_denominators[i], _normalize=False)
        for i, _ in enumerate(weights, start=1)
    ]

    # Returns 0 if there's no matching n-grams
    # We only need to check for p_numerators[1] == 0, since if there's
    # no unigrams, there won't be any higher order ngrams.
    if p_numerators[1] == 0:
        return 0

    # If there's no smoothing, set use method0 from SmoothinFunction class.
    if not smoothing_function:
        smoothing_function = SmoothingFunction().method0

    # Smoothen the modified precision.
    # Note: smoothing_function() may convert values into floats;
    # it tries to retain the Fraction object as much as the
    # smoothing method allows.
    p_n = smoothing_function(
        p_n, references=references, hypothesis=hypothesis, hyp_len=hyp_lengths
    )
    s = (w_i * math.log(p_i) for w_i, p_i in zip(weights, p_n))
    s = bp * math.exp(math.fsum(s))
    return s
```
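The last two lines of the function are where the weights actually enter: BLEU = BP · exp(Σᵢ wᵢ · log pᵢ), i.e. a weighted geometric mean of the modified n-gram precisions scaled by the brevity penalty. A minimal, NLTK-free sketch of just that combination step (the `p_n` values below are made up for illustration):

```python
import math

def combine_bleu(p_n, weights, bp=1.0):
    """Combine modified n-gram precisions p_n into a BLEU score, mirroring
    the final lines of corpus_bleu: a weighted geometric mean of the
    precisions (computed in log space), scaled by the brevity penalty."""
    s = (w_i * math.log(p_i) for w_i, p_i in zip(weights, p_n))
    return bp * math.exp(math.fsum(s))

# Hypothetical modified precisions for n-gram orders 1..4.
p_n = [0.8, 0.6, 0.4, 0.2]

bleu4 = combine_bleu(p_n, (0.25, 0.25, 0.25, 0.25))
bleu1 = combine_bleu(p_n, (1.0,))  # zip truncates: only unigram precision is used
```

Because zero-weight terms contribute nothing to the sum, zeroing out the higher-order weights is all it takes to fall back from BLEU-4 to BLEU-1/2/3, which is exactly the trick described above.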