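Converting a JSON file to XML
I previously converted the format of a dataset from an Alibaba competition. The dataset is for defect detection on bottled baijiu (Chinese liquor), and the defects fall into three broad classes: bottle-cap defects, label defects, and spray-code defects.
All images are stored in the images folder and have the .jpg extension. The annotation file is annotations.json. The annotations use a format similar to that of the MSCOCO dataset (http://cocodataset.org), with the following data structure: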
{
    "images":
    [
        {"file_name":"cat.jpg", "id":1, "height":1000, "width":1000},
        {"file_name":"dog.jpg", "id":2, "height":1000, "width":1000},
        …
    ],
    "annotations":
    [
        {"image_id":1, "bbox":[100.00, 200.00, 10.00, 10.00], "category_id": 1},
        {"image_id":2, "bbox":[150.00, 250.00, 20.00, 20.00], "category_id": 2},
        …
    ],
    "categories":
    [
        {"id":0, "name":"bg"},
        {"id":1, "name":"cat"},
        {"id":2, "name":"dog"},
        …
    ]
}
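In the annotation file, the "images" key holds the image information, the "annotations" key holds the annotation information, and the "categories" key holds the category information:
"images": each entry corresponds to one image. "file_name" is the image file name, "id" is the image id, and "height" and "width" are the image height and width.
"annotations": each entry corresponds to one annotation. "image_id" is the id of the image the annotation belongs to. "bbox" is the bounding rectangle in the order [x, y, w, h]: the x and y coordinates of the box's top-left corner, followed by its width and height. "category_id" is the category id.
"categories": each entry corresponds to one category. "id" is the category id and "name" is the category name.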
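How the keys relate to each other:
Example: in the data structure listed above, the annotation
{"image_id":1, "bbox":[100.00, 200.00, 10.00, 10.00], "category_id": 1}
can be matched to the image "cat.jpg" through "image_id" and to the category "cat" through "category_id".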
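The lookup described here can be sketched with two dictionaries keyed by id. The sample data below copies the cat/dog example from the structure above; the names `image_names`, `category_names`, and `resolve` are hypothetical helpers, not part of the dataset or of any library.

```python
sample = {
    "images": [
        {"file_name": "cat.jpg", "id": 1, "height": 1000, "width": 1000},
        {"file_name": "dog.jpg", "id": 2, "height": 1000, "width": 1000},
    ],
    "annotations": [
        {"image_id": 1, "bbox": [100.00, 200.00, 10.00, 10.00], "category_id": 1},
        {"image_id": 2, "bbox": [150.00, 250.00, 20.00, 20.00], "category_id": 2},
    ],
    "categories": [
        {"id": 0, "name": "bg"},
        {"id": 1, "name": "cat"},
        {"id": 2, "name": "dog"},
    ],
}

# Build id -> name lookup tables once, then resolve any annotation in O(1).
image_names = {img["id"]: img["file_name"] for img in sample["images"]}
category_names = {cat["id"]: cat["name"] for cat in sample["categories"]}


def resolve(ann):
    """Return (image file name, category name) for one annotation entry."""
    return image_names[ann["image_id"]], category_names[ann["category_id"]]


if __name__ == "__main__":
    print(resolve(sample["annotations"][0]))  # ('cat.jpg', 'cat')
```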
Background images:
Among the elements of "annotations", "category_id": 0 denotes background. An image is a background image if and only if every annotation belonging to it has "category_id" 0.
The code is as follows:
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# .json --> .xml: convert COCO-style annotations to one VOC-style XML file per image
import json
import os
from collections import defaultdict

import cv2

# 1. Dictionary that maps the category ids in the labels to class names
CATEGORY_NAMES = {
    '0': "background",
    '1': "Broken bottle cap",
    '2': "Bottle cap deformation",
    '3': "Broken edge of bottle cap",
    '4': "Bottle cap spinning",
    '5': "Cap breakpoint",
    '6': "Label skew",
    '7': "Label wrinkled",
    '8': "Label bubble",
    '9': "Code spraying is normal",
    '10': "Code spraying is not normal",
}


def json_to_xml(data, img_path, xml_path):
    os.makedirs(xml_path, exist_ok=True)

    # 2. Map each image id to its file name
    data_dic = {img['id']: img['file_name'].strip() for img in data['images']}

    # 3. Group the annotations by image id so that each XML file is written in one pass
    anns_by_image = defaultdict(list)
    for ann in data['annotations']:
        anns_by_image[ann['image_id']].append(ann)

    # 4. Iterate over the images
    for img_id, name in data_dic.items():
        print(name)
        anns = anns_by_image.get(img_id)
        if not anns:
            continue  # no annotations for this image, so no XML file is written
        pic = cv2.imread(os.path.join(img_path, name))
        if pic is None:
            print('Could not read image, skipped: ' + name)
            continue
        # Get the image size (cv2.imread returns a 3-channel array by default)
        Pheight, Pwidth, Pdepth = pic.shape

        # 5. Create the XML file for this image
        img_name = os.path.splitext(name)[0]
        with open(os.path.join(xml_path, img_name + '.xml'), 'w', encoding='utf-8') as xml_file:
            xml_file.write('<annotation>\n')
            xml_file.write('    <folder>VOC2007</folder>\n')
            xml_file.write('    <filename>' + name + '</filename>\n')
            xml_file.write('    <source>\n')
            xml_file.write('        <database>orgaquant</database>\n')
            xml_file.write('        <annotation>organoids</annotation>\n')
            xml_file.write('    </source>\n')
            xml_file.write('    <size>\n')
            xml_file.write('        <width>' + str(Pwidth) + '</width>\n')
            xml_file.write('        <height>' + str(Pheight) + '</height>\n')
            xml_file.write('        <depth>' + str(Pdepth) + '</depth>\n')
            xml_file.write('    </size>\n')
            xml_file.write('    <segmented>0</segmented>\n')
            # 6. Write one <object> per annotation of this image
            for ann in anns:
                x, y, w, h = ann['bbox']  # [x, y, w, h] -> two corners
                xml_file.write('    <object>\n')
                xml_file.write('        <name>' + CATEGORY_NAMES[str(ann['category_id'])] + '</name>\n')
                xml_file.write('        <pose>Unspecified</pose>\n')
                xml_file.write('        <truncated>0</truncated>\n')
                xml_file.write('        <difficult>0</difficult>\n')
                xml_file.write('        <bndbox>\n')
                xml_file.write('            <xmin>' + str(int(x)) + '</xmin>\n')
                xml_file.write('            <ymin>' + str(int(y)) + '</ymin>\n')
                xml_file.write('            <xmax>' + str(int(x + w)) + '</xmax>\n')
                xml_file.write('            <ymax>' + str(int(y + h)) + '</ymax>\n')
                xml_file.write('        </bndbox>\n')
                xml_file.write('    </object>\n')
            # 7. Close the root element after the last object
            xml_file.write('</annotation>\n')

    print("Done !")


if __name__ == '__main__':
    # Change the file and folder names below to your own
    with open('annotations.json', encoding='utf-8') as f:
        data = json.load(f)
    json_to_xml(data, "images/", "XML/")
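The script above assembles the XML by concatenating strings, which is fine as long as no file name or class name contains characters such as < or &. As an alternative, and not the approach used in this post, the same kind of document can be built with the standard-library xml.etree.ElementTree module, which escapes text automatically. The sketch below is a reduced version that omits the folder, source, pose, truncated, and difficult tags; the function name `build_voc_xml` and its parameters are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_voc_xml(filename, width, height, depth, objects):
    """Build a reduced VOC-style annotation string.

    `objects` is a list of (class_name, (xmin, ymin, xmax, ymax)) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    for class_name, corners in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = class_name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), corners):
            ET.SubElement(box, tag).text = str(value)
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_voc_xml("cat.jpg", 1000, 1000, 3, [("Label skew", (100, 200, 110, 210))]))
```

The returned string can be written to a file with an ordinary open(..., 'w', encoding='utf-8') call, one file per image, just as in the script above.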