Python + MySQL: Batch-Querying Baidu Index Counts

Published: 2024/10/12 · Database · 豆豆

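People who do SEO often manage several hundred or several thousand sites and need to analyze how many pages of each site are indexed.

The commonly used tools are not a good fit when the indexing analysis has to cover thousands of sites.

So here is a script that queries Baidu index counts in bulk.

It uses Python together with MySQL (MariaDB).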
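The script below reads from a table named `bdshoulu` in a database named `domain` and touches three columns: `id`, `website`, and `shoulu`. The post does not show the table definition, so the following is only a sketch: the column types and sizes are assumptions, and `CREATE_TABLE_SQL` and `create_table` are hypothetical names introduced for illustration.

```python
# Hypothetical schema: only the table name and the three column names come
# from the script; the column types and sizes are assumptions.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS bdshoulu (
    id INTEGER PRIMARY KEY,
    website VARCHAR(255) NOT NULL,
    shoulu INTEGER
)
"""


def create_table(conn) -> None:
    """Create the table through any DB-API 2.0 connection,
    for example one returned by pymysql.connect()."""
    cursor = conn.cursor()
    try:
        cursor.execute(CREATE_TABLE_SQL)
        conn.commit()
    finally:
        cursor.close()
```

With this definition, each row needs an explicit `id` when it is inserted, and `shoulu` stays NULL until the script has checked that site.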

import re
import sys
import time
from urllib import request

import pymysql

# Database access class
class DataExec:
    # Class attributes
    # Database and table names
    db = "domain"
    dt = "bdshoulu"

    # Database login credentials
    hostName = "localhost"
    userName = "root"
    password = "pwd"

    # Constructor: open the connection once and reuse it
    def __init__(self):
        self.conn = self.connect()

    # Destructor: close the connection if it was opened
    def __del__(self):
        conn = getattr(self, "conn", None)
        if conn is not None:
            conn.close()

    # Create the database connection object
    # (named connect so it does not clash with the self.conn attribute)
    def connect(self):
        return pymysql.connect(host=self.hostName,
                               user=self.userName,
                               password=self.password,
                               database=self.db,
                               charset='utf8mb4')

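    # Fetch the id and domain of every site to check
    def selectwebsite(self):
        cursor = self.conn.cursor()
        # A table name cannot be bound as a query parameter, so it is interpolated
        sql = 'select id,website from %s order by id' % self.dt
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        except pymysql.MySQLError as exc:
            # Report the failing statement and return no rows so the caller can still iterate
            print("%s failed: %s" % (sql, exc))
            return ()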

    # Update the stored index count for one site
    def update_shoulu(self, id, shoulu):
        conn = self.conn
        cursor = conn.cursor()
        # Bind the values as parameters instead of formatting them into the SQL
        sql = 'update {_table} set shoulu = %s where id = %s'.format(_table=self.dt)
        try:
            cursor.execute(sql, (shoulu, id))
            # Commit the change
            conn.commit()
        except pymysql.MySQLError:
            # Roll back on failure
            conn.rollback()

    # Commit any pending changes in one go
    def commit(self):
        self.conn.commit()

db = DataExec()
results = db.selectwebsite()

# Compile the patterns once. Baidu's result pages are in Simplified Chinese,
# so the patterns must be too. Counts may contain thousands separators.
count_pattern = re.compile(r'找到相关结果数约([\d,]+)个')
site_pattern = re.compile(r'该网站共有\s*([\d,]+)\s*个网页被百度收录')

for row in results or ():
    id = row[0]
    website = row[1]
    url = "https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=site:" + website
    # print(url)
    try:
        req = request.Request(url)
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36')
        # Send the request
        response = request.urlopen(req, timeout=15)
        # A status code of 200 means success
        if response.status == 200:
            # Read the page content
            html = response.read().decode('utf-8', 'ignore')
            # print(html)
            # Try the "about N results found" line first, then the
            # "N pages of this site are indexed by Baidu" line
            m = count_pattern.search(html) or site_pattern.search(html)
            if m:
                slnum = int(m.group(1).replace(',', ''))
                print(id, website, 'indexed:', slnum)
                db.update_shoulu(id, slnum)
            else:
                print(id, website)
    except Exception as exc:
        # Skip sites that fail (network error, timeout, unexpected page)
        print(id, website, 'failed:', exc)
        continue
    # Pause between requests
    time.sleep(1)

# Commit everything at once
# db.commit()
sys.exit()

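The idea behind the code above: read each domain (website) from the database, have Python request Baidu's site: results page with that domain filled into the query parameter, then use regular expressions to pull the index count out of the returned page. The overall flow is simple. If it is unfamiliar, reading through the code once is enough. Feel free to take it if you need it.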
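The two steps that need no network or database, building the query URL and extracting the count, can be tried in isolation. The sketch below is an illustration rather than part of the script: `build_site_query_url` and `parse_indexed_count` are hypothetical names, the query parameters are copied from the script's URL, and the patterns assume Baidu's Simplified Chinese wording with optional thousands separators. Baidu can change that wording at any time, so treat the patterns as assumptions.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Assumed page wording; adjust if Baidu's result page changes.
COUNT_PATTERNS = (
    re.compile(r"找到相关结果数约([\d,]+)个"),
    re.compile(r"该网站共有\s*([\d,]+)\s*个网页被百度收录"),
)


def build_site_query_url(website: str) -> str:
    """Build the Baidu search URL for a site: query on one domain."""
    params = {
        "ie": "utf-8",
        "f": "8",
        "rsv_bp": "1",
        "rsv_idx": "1",
        "tn": "baidu",
        "wd": "site:" + website,
    }
    # urlencode percent-encodes the value, so "site:" becomes "site%3A"
    return "https://www.baidu.com/s?" + urlencode(params)


def parse_indexed_count(html: str) -> Optional[int]:
    """Return the index count found in the page, or None if no pattern matches."""
    for pattern in COUNT_PATTERNS:
        match = pattern.search(html)
        if match:
            # Drop thousands separators before converting to int
            return int(match.group(1).replace(",", ""))
    return None
```

Keeping the parsing in a pure function makes it possible to check the patterns against a saved copy of a result page before running the full loop against thousands of sites.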
