Extracting dates, times and date ranges from text in PHP
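I am building a local events calendar that takes RSS feeds and website scrapes and extracts event dates from them.
I previously asked how to extract dates from text in PHP, and MarcDefiant gave a great answer: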
function parse_date_tokens($tokens) {
    # only try to extract a date if we have 2 or more tokens
    if(!is_array($tokens) || count($tokens) < 2) return false;
    return strtotime(implode(" ", $tokens));
}
function extract_dates($text) {
    static $patterns = Array(
        '/^[0-9]+(st|nd|rd|th|)?$/i', # day
        '/^(Jan(uary)?|Feb(ruary)?|Mar(ch)?|etc)$/i', # month ("etc" stands in for the remaining months)
        '/^20[0-9]{2}$/', # year
        '/^of$/' # words
    );
    # defines which of the above patterns aren't actually part of a date
    static $drop_patterns = Array(
        false,
        false,
        false,
        true
    );
    $tokens = Array();
    $result = Array();
    $text = str_word_count($text, 1, '0123456789'); # get all words in text
    # iterate words and search for matching patterns
    foreach($text as $word) {
        $found = false;
        foreach($patterns as $key => $pattern) {
            if(preg_match($pattern, $word)) {
                if(!$drop_patterns[$key]) {
                    $tokens[] = $word;
                }
                $found = true;
                break;
            }
        }
        # a non-date word ends the current run of tokens, so try to parse it
        if(!$found) {
            $result[] = parse_date_tokens($tokens);
            $tokens = Array();
        }
    }
    $result[] = parse_date_tokens($tokens);
    # drop the false entries left by runs that were not dates
    return array_filter($result);
}
# test
$texts = Array(
    "The focus of the seminar, on Saturday 2nd February 2013 will be [...]",
    "Valentines Special @ The Radisson, Feb 14th",
    "On Friday the 15th of February, a special Hollywood themed [...]",
    "Symposium on Childhood Play on Friday, February 8th",
    "Hosting a craft workshop March 9th - 11th in the old [...]"
);
$dates = extract_dates(implode(" ", $texts));
echo "Dates: \n";
foreach($dates as $date) {
    echo " " . date('d.m.Y H:i:s', $date) . "\n";
}
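However, this solution has some drawbacks. First, it cannot match date ranges.
I am now looking for a more sophisticated solution that can extract dates, times and date ranges from the sample text.
Is this the best approach? It looks like I'm falling back on a series of regular expression statements, run one after another to catch these cases. I can't see a better way to capture date ranges, but I know there must be a better way to do this. Are there any libraries dedicated to date parsing in PHP?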
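To make the date-range problem concrete, here is a minimal sketch of the regex approach for the simplest case only: a range that stays within one month, such as "March 9th - 11th". The function name `extract_same_month_ranges` and the `$defaultYear` parameter are my own assumptions for illustration, not part of MarcDefiant's answer. Ranges that cross months ("Dec 30th - Jan 3rd, 2014") or put the month last ("Mon 21 - Wed 23 April") are deliberately not handled.

```php
<?php
// Hypothetical helper (not from the original answer): finds same-month
// ranges such as "March 9th - 11th" or "February 14th-16th, 2014".
// $defaultYear is an assumption, used when the text gives no year.
function extract_same_month_ranges(string $text, int $defaultYear): array
{
    static $monthNumbers = [
        'jan' => 1, 'feb' => 2, 'mar' => 3, 'apr' => 4, 'may' => 5, 'jun' => 6,
        'jul' => 7, 'aug' => 8, 'sep' => 9, 'oct' => 10, 'nov' => 11, 'dec' => 12,
    ];
    $month = '(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
           . '|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)';
    $day = '(\d{1,2})(?:st|nd|rd|th)?';
    // Month, first day, hyphen or en dash, second day, optional 20xx year.
    $pattern = '/\b' . $month . '\s+' . $day . '\s*[-\x{2013}]\s*' . $day
             . '(?:,?\s+(20\d{2}))?/iu';

    $ranges = [];
    if (!preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        return $ranges;
    }
    foreach ($matches as $m) {
        // The first three letters are enough to identify the month.
        $monthNumber = $monthNumbers[strtolower(substr($m[1], 0, 3))];
        $year = isset($m[4]) ? (int) $m[4] : $defaultYear;
        $from = (int) $m[2];
        $to = (int) $m[3];
        // Keep only ranges that run forward and end on a real calendar day.
        if ($from >= 1 && $from <= $to && checkdate($monthNumber, $to, $year)) {
            $ranges[] = [
                sprintf('%04d-%02d-%02d', $year, $monthNumber, $from),
                sprintf('%04d-%02d-%02d', $year, $monthNumber, $to),
            ];
        }
    }
    return $ranges;
}
```

Calling `extract_same_month_ranges("Hosting a craft workshop March 9th - 11th in the old [...]", 2013)` returns one pair of ISO date strings, the start and the end of the range. Returning strings rather than timestamps keeps the result independent of the server's timezone.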
As requested, here are sample dates and date ranges:
$dates = [
    " Saturday 28th December",
    "2013/2014",
    "Friday 10th of January",
    "Thursday 19th December",
    " on Sunday the 15th December at 1 p.m",
    "On Saturday December 14th ",
    "On Saturday December 21st at 7.30pm",
    "Saturday, March 21st, 9.30 a.m.",
    "Jan-April 2014",
    "January 21st - Jan 24th 2014",
    "Dec 30th - Jan 3rd, 2014",
    "February 14th-16th, 2014",
    "Mon 14 - Wed 16 April, 12 - 2pm",
    "Sun 13 April, 8pm",
    "Mon 21 - Wed 23 April",
    "Friday 25 April, 10 – 3pm",
    "The focus of the seminar, on Saturday 2nd February 2013 will be [...]",
    "Valentines Special @ The Radisson, Feb 14th",
    "On Friday the 15th of February, a special Hollywood themed [...]",
    "Symposium on Childhood Play on Friday, February 8th",
    "Hosting a craft workshop March 9th - 11th in the old [...]"
];
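The function I'm currently using (not the one above) is about 90% accurate. It can capture date ranges, but it struggles when a time is also specified. It uses a list of regular expressions and is very complicated.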
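One possible way to reduce the interference between times and dates is to pull the times out in a separate pass. The sketch below is an illustration under that assumption, not the function described above. The name `extract_times` is hypothetical, and it only recognizes 12-hour times that carry an am/pm marker, so in "12 - 2pm" it finds only the "2pm".

```php
<?php
// Hypothetical helper: pulls 12-hour clock times such as "7.30pm",
// "9.30 a.m." or "1 p.m" out of text and returns them as "HH:MM" strings.
function extract_times(string $text): array
{
    // Hour, optional ".mm" or ":mm", optional space, then am/pm
    // written with or without a dot after the first letter.
    $pattern = '/\b(\d{1,2})(?:[.:](\d{2}))?\s*([ap])\.?m\b/i';

    $times = [];
    if (!preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        return $times;
    }
    foreach ($matches as $m) {
        $hour = (int) $m[1];
        // An unmatched minutes group comes back as an empty string.
        $minute = $m[2] === '' ? 0 : (int) $m[2];
        if ($hour < 1 || $hour > 12 || $minute > 59) {
            continue; // not a valid 12-hour clock time
        }
        // Convert to 24-hour time: 12am becomes 0, 12pm stays 12.
        $hour = $hour % 12;
        if (strtolower($m[3]) === 'p') {
            $hour += 12;
        }
        $times[] = sprintf('%02d:%02d', $hour, $minute);
    }
    return $times;
}
```

For example, `extract_times("On Saturday December 21st at 7.30pm")` returns `['19:30']`. Ordinal days such as "21st" are not picked up, because the pattern requires the am/pm marker directly after the number.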
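Update: January 6, 2014
I'm working on code to do this, using the original approach of a series of regular expressions run in sequence. I think I'm close to a working solution that can extract almost any date, time range or format from a piece of text. When I'm done, I'll post it here as an answer.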
總結(jié)
以上是生活随笔為你收集整理的php 日期时间 取日期,从PHP中的文本中提取日期,时间和日期范围的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: Windows中的字体映射关系
- 下一篇: Nginx主配置文件的优化-nginx主