In this post, let's look at the common data formats, at parsing and parsers, and at how parsing is done in practice.
1. CSV, XML, JSON
(1) CSV (Comma-Separated Values)
CSV is text data, or a text file, in which fields are separated by commas (,).
name,age,visitTime
최하늘,23,2023-01-28
이구름,15,2023-01-29
김산,32,2023-01-30
(2) XML (eXtensible Markup Language)
XML is a text-based markup language that closely resembles HTML. It describes the shape of the data through tags. Unlike HTML tags, XML tags are not predefined; whoever writes the document defines them.
<visitorResult>
  <visitRange>20230124~20230130</visitRange>
  <visitors>
    <visitor>
      <name>최하늘</name>
      <age>23</age>
      <visitTime>2023-01-28</visitTime>
    </visitor>
    <visitor>
      <name>이구름</name>
      <age>15</age>
      <visitTime>2023-01-29</visitTime>
    </visitor>
    <visitor>
      <name>김산</name>
      <age>32</age>
      <visitTime>2023-01-30</visitTime>
    </visitor>
  </visitors>
</visitorResult>
(3) JSON (JavaScript Object Notation)
JSON is a data format made up of key-value pairs.
{
  "visitorResult": {
    "visitRange": "20230124~20230130",
    "visitors": [
      {
        "name": "최하늘",
        "age": 23,
        "visitTime": "2023-01-28"
      },
      {
        "name": "이구름",
        "age": 15,
        "visitTime": "2023-01-29"
      },
      {
        "name": "김산",
        "age": 32,
        "visitTime": "2023-01-30"
      }
    ]
  }
}
The characteristics of each format are as follows.
| | CSV | XML | JSON |
|---|---|---|---|
| Pros | Smallest of the three. Because it is so compact, it is mainly used to deliver large amounts of data that do not change | Most intuitive of the three. It carries meta information, so it offers more than a bare representation of the data | The shape and the rules are simple, so it is easy to implement in other languages as well |
| Cons | As the data grows, it becomes hard to see which value belongs to which field | The metadata can end up larger than the information actually being transmitted | Vulnerable to syntax errors such as a missing comma or a wrongly closed brace |
| Typical use | Building simple tables, or code where speed matters | Simple search options, or data that is often edited by hand | Most widely used for server communication through REST APIs |
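To make the size comparison concrete, here is a minimal sketch that encodes one visitor record in each format and counts the UTF-8 bytes. The class name FormatSizeDemo and the placeholder name "Kim" are illustrative assumptions, and a single record is of course only a rough indication of how the formats scale.

```java
import java.nio.charset.StandardCharsets;

final class FormatSizeDemo {
    // The same record in each format; "Kim" is a placeholder name.
    static final String CSV = "name,age,visitTime\nKim,32,2023-01-30\n";
    static final String JSON =
            "{\"name\":\"Kim\",\"age\":32,\"visitTime\":\"2023-01-30\"}";
    static final String XML = "<visitor><name>Kim</name><age>32</age>"
            + "<visitTime>2023-01-30</visitTime></visitor>";

    // Size of the text once encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("CSV:  " + utf8Bytes(CSV) + " bytes");
        System.out.println("JSON: " + utf8Bytes(JSON) + " bytes");
        System.out.println("XML:  " + utf8Bytes(XML) + " bytes");
    }
}
```

In XML every field name appears twice (opening and closing tag) for every record, while CSV states the field names once in the header, which is why the gap widens as more records are added.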
2. Parsing & Parser
Parsing is the process of telling the tags in a document apart and extracting their contents in order to obtain the information you need. It breaks a sentence down into its constituent parts and analyzes the hierarchical relationships among those parts to recover the content.
Parsing techniques include XML parsing and JSON parsing. (CSV data can be handled easily just by splitting on commas.)
A parser is a program that performs parsing. Parsers are also one component of compilers and interpreters, where they read in the source program and work out the structure of its statements.
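As a minimal sketch of the CSV remark above, the following splits each line on commas and pairs every value with its column name from the header. The class name CsvSplitDemo and its methods are illustrative names, not part of the original post, and the sketch assumes no field contains a quoted or embedded comma, which real-world CSV files may.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CsvSplitDemo {
    // Splits one CSV line on commas; -1 keeps trailing empty fields.
    // Assumes there are no quoted fields or embedded commas.
    static List<String> parseLine(String line) {
        return Arrays.asList(line.split(",", -1));
    }

    // Treats the first line as the header and renders each following row
    // as "column=value" pairs.
    static List<String> describe(List<String> lines) {
        List<String> header = parseLine(lines.get(0));
        List<String> out = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            List<String> fields = parseLine(line);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < header.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(header.get(i)).append('=').append(fields.get(i));
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "name,age,visitTime",
                "Kim,32,2023-01-30"); // "Kim" is a placeholder name
        for (String row : describe(lines)) {
            System.out.println(row); // name=Kim, age=32, visitTime=2023-01-30
        }
    }
}
```

For files with quoted fields, a dedicated CSV library is usually a safer choice than a plain split.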
3. XML Parsing
XML can be parsed with a SAX (Simple API for XML) parser or a DOM (Document Object Model) parser. A SAX parser processes the document while reading it, reacting to events such as the start and end of a tag. A DOM parser reads the whole document first, stores the entire document structure in a data structure, and then traverses it.
| | SAX Parser | DOM Parser |
|---|---|---|
| Characteristics | Relatively fast. Does not keep the document in memory | Relatively slow. Keeps the document in memory |
| Typical use | Read-only access | Modifying the XML content |
Let's parse the same 'visitor.xml' with a SAX parser and with a DOM parser, using Visitor.java in both cases.
<visitorResult>
  <visitRange>20230124~20230130</visitRange>
  <visitors>
    <visitor>
      <name>최하늘</name>
      <age>23</age>
      <visitTime>2023-01-28</visitTime>
    </visitor>
    <visitor>
      <name>이구름</name>
      <age>15</age>
      <visitTime>2023-01-29</visitTime>
    </visitor>
    <visitor>
      <name>김산</name>
      <age>32</age>
      <visitTime>2023-01-30</visitTime>
    </visitor>
  </visitors>
</visitorResult>
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class Visitor {
    private String name;
    private Integer age;
    private Date visitTime;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getAge() {
        return age;
    }

    public void setAge(Integer age) {
        this.age = age;
    }

    public Date getVisitTime() {
        return visitTime;
    }

    public void setVisitTime(Date visitTime) {
        this.visitTime = visitTime;
    }

    // Converts a "yyyy-MM-dd" string to a Date; returns null if the text cannot be parsed.
    public Date toDate(String date) {
        Date dateObj = null;
        // "yyyy" is the calendar year. The week-year letter "YYYY" makes the parser
        // ignore MM-dd, so every date in visitor.xml would come back as Sun Jan 01 2023.
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        try {
            dateObj = format.parse(date);
        } catch (ParseException e) {
            e.printStackTrace();
        }
        return dateObj;
    }

    @Override
    public String toString() {
        return "Visitor [name=" + name + ", age=" + age + ", visitTime=" + visitTime + "]";
    }
}
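SimpleDateFormat distinguishes the calendar-year letter y from the week-year letter Y, which matters for the pattern passed to SimpleDateFormat in toDate(). The following is a small sketch for comparing the two on one of the visit dates. DatePatternDemo and dayOfMonth are illustrative names, and Locale.US is an assumption made so that the week rules are fixed.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

final class DatePatternDemo {
    // Parses text with the given pattern and returns the day of month of the result.
    static int dayOfMonth(String pattern, String text) throws ParseException {
        Date date = new SimpleDateFormat(pattern, Locale.US).parse(text);
        Calendar cal = Calendar.getInstance(Locale.US);
        cal.setTime(date);
        return cal.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) throws ParseException {
        // Calendar year: month and day are taken from the text.
        System.out.println("yyyy-MM-dd -> day " + dayOfMonth("yyyy-MM-dd", "2023-01-28"));
        // Week year: the date is resolved from the week year instead,
        // so the day of month no longer matches the text.
        System.out.println("YYYY-MM-dd -> day " + dayOfMonth("YYYY-MM-dd", "2023-01-28"));
    }
}
```

If the two lines print different days, the pattern in use is the week-year one and should be switched to lowercase yyyy.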
(1) SAX Parser
To use a SAX parser, extend DefaultHandler and override startDocument(), endDocument(), startElement(), endElement(), and characters().
Then create a factory with SAXParserFactory, use it to create a SAXParser, and call parse on the resulting parser.
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class VisitorSaxParser extends DefaultHandler {
    private final File xml = new File("./visitor.xml");
    private final List<Visitor> list = new ArrayList<>();
    private Visitor current;
    // Text of the element being read; characters() may deliver it in several chunks.
    private final StringBuilder content = new StringBuilder();

    public List<Visitor> getVisitor() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            parser.parse(xml, this);
        } catch (IOException | ParserConfigurationException | SAXException e) {
            e.printStackTrace();
        }
        return list;
    }

    @Override
    public void startDocument() throws SAXException {
        System.out.println("Document Parsing START");
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("Document Parsing END");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        content.setLength(0); // start collecting text for the new element
        if (qName.equals("visitor")) {
            current = new Visitor();
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String text = content.toString().trim();
        if (qName.equals("visitor")) {
            list.add(current);
        } else if (qName.equals("name")) {
            current.setName(text);
        } else if (qName.equals("age")) {
            current.setAge(Integer.parseInt(text));
        } else if (qName.equals("visitTime")) {
            current.setVisitTime(current.toDate(text));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Append rather than overwrite: one text node can arrive in multiple calls.
        content.append(ch, start, length);
    }
}
import java.util.List;

public class SaxParserTest {
    public static void main(String[] args) {
        VisitorSaxParser handler = new VisitorSaxParser();
        List<Visitor> visitors = handler.getVisitor();
        for (Visitor visitor : visitors) {
            System.out.println(visitor);
        }
    }
}
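// Output when toDate() parses with the calendar-year pattern "yyyy-MM-dd"
// (the week-year pattern "YYYY-MM-dd" prints Sun Jan 01 for every visitor):
// Document Parsing START
// Document Parsing END
// Visitor [name=최하늘, age=23, visitTime=Sat Jan 28 00:00:00 KST 2023]
// Visitor [name=이구름, age=15, visitTime=Sun Jan 29 00:00:00 KST 2023]
// Visitor [name=김산, age=32, visitTime=Mon Jan 30 00:00:00 KST 2023]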
SAX processes the document based on the events raised while it is being read. startDocument(), endDocument(), startElement(), endElement(), and characters() are the event handlers. The breakdown below shows where each one fires.
Event order in the document:
(1)<visitorResult>(2)
  ...
  <visitor>(2)
    <name>(2)최하늘(3)</name>(4)
    <age>(2)23(3)</age>(4)
    <visitTime>(2)2023-01-28(3)</visitTime>(4)
  </visitor>(4)
  <visitor>(2)
    <name>(2)이구름(3)</name>(4)
    <age>(2)15(3)</age>(4)
    <visitTime>(2)2023-01-29(3)</visitTime>(4)
  </visitor>(4)
  ...
</visitorResult>(4)(5)
Event handlers:
(1) startDocument() (2) startElement() (3) characters() (4) endElement() (5) endDocument()
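The event order can also be observed directly. The following is a minimal sketch that records each handler call for a small in-memory document instead of visitor.xml. SaxEventOrderDemo and trace are illustrative names, the sample element values are placeholders, and whitespace-only text is skipped to keep the trace short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEventOrderDemo {
    // Parses the XML text and returns the handler calls in the order SAX made them.
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void endDocument() { events.add("endDocument"); }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes attrs) {
                events.add("startElement:" + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) events.add("characters:" + text);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // "Kim" is a placeholder name.
        for (String event : trace("<visitor><name>Kim</name><age>32</age></visitor>")) {
            System.out.println(event);
        }
    }
}
```

The trace starts with startDocument, ends with endDocument, and nests each element's start, text, and end events in document order.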
(2) DOM Parser
With a DOM parser, you start at the root and keep descending into child nodes, reading the values you need along the way. This works because the parser builds a DOM tree in which every element of the document is represented as a Node. As with the SAX parser, everything starts from a factory.
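import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class VisitorDomParser {
    private final File xml = new File("./visitor.xml");
    private final List<Visitor> list = new ArrayList<>();

    public List<Visitor> getVisitor() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // The whole document is read into a tree before any value is accessed.
            Document doc = builder.parse(xml);
            Element root = doc.getDocumentElement();
            parse(root);
        } catch (IOException | ParserConfigurationException | SAXException e) {
            e.printStackTrace();
        }
        return list;
    }

    // Collects every <visitor> element under the root.
    private void parse(Element root) {
        NodeList visitors = root.getElementsByTagName("visitor");
        for (int i = 0; i < visitors.getLength(); i++) {
            Node child = visitors.item(i);
            list.add(getVisitor(child));
        }
    }

    // Maps the children of one <visitor> node onto a Visitor.
    // Whitespace text nodes between the tags match none of the names and are skipped.
    private Visitor getVisitor(Node node) {
        Visitor visitor = new Visitor();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeName().equals("name")) {
                visitor.setName(child.getTextContent());
            } else if (child.getNodeName().equals("age")) {
                visitor.setAge(Integer.parseInt(child.getTextContent()));
            } else if (child.getNodeName().equals("visitTime")) {
                visitor.setVisitTime(visitor.toDate(child.getTextContent()));
            }
        }
        return visitor;
    }
}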
import java.util.List;

public class DomParserTest {
    public static void main(String[] args) {
        VisitorDomParser parser = new VisitorDomParser();
        List<Visitor> visitors = parser.getVisitor();
        for (Visitor visitor : visitors) {
            System.out.println(visitor);
        }
    }
}
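// Output when toDate() parses with the calendar-year pattern "yyyy-MM-dd"
// (the week-year pattern "YYYY-MM-dd" prints Sun Jan 01 for every visitor):
// Visitor [name=최하늘, age=23, visitTime=Sat Jan 28 00:00:00 KST 2023]
// Visitor [name=이구름, age=15, visitTime=Sun Jan 29 00:00:00 KST 2023]
// Visitor [name=김산, age=32, visitTime=Mon Jan 30 00:00:00 KST 2023]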
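The comparison table earlier lists modifying XML content as the typical use of a DOM parser. As a minimal sketch of that use, the following loads a small in-memory document, rewrites the first age element, and serializes the tree back to text. DomEditDemo and setFirstAge are illustrative names, the sample values are placeholders, and a Transformer is just one common way to write a DOM tree out.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class DomEditDemo {
    // Builds a DOM tree, replaces the text of the first <age> element,
    // and returns the modified document as a string.
    static String setFirstAge(String xml, int newAge) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Node age = doc.getElementsByTagName("age").item(0);
        age.setTextContent(Integer.toString(newAge));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "Kim" is a placeholder name.
        String xml = "<visitor><name>Kim</name><age>32</age></visitor>";
        System.out.println(setFirstAge(xml, 33));
    }
}
```

Because the whole tree is held in memory, any node can be changed before writing the document out, which a streaming SAX handler cannot do.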
4. JSON Parsing
JSON parsing is comparatively simple. Create an ObjectMapper and read the file with it. Then use the keys to pick out the values you want, and pass each one to convertValue, which stores the values in a Visitor instance automatically.
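import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
// Assumes Jackson 2.x; the original listing does not show its imports.
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonParser {
    private final File json = new File("./visitor.json");
    private final List<Visitor> list = new ArrayList<>();

    @SuppressWarnings({ "unchecked", "rawtypes" })
    public List<Visitor> getVisitor() {
        ObjectMapper mapper = new ObjectMapper();
        // Dates in visitor.json are written as yyyy-MM-dd.
        mapper.setDateFormat(new SimpleDateFormat("yyyy-MM-dd"));
        try {
            // Read the whole file as nested maps, then pick out the "visitors" array.
            Map<String, Map<String, Object>> result = mapper.readValue(json, Map.class);
            List<Map<String, Object>> items = (List) result.get("visitorResult").get("visitors");
            for (Map<String, Object> item : items) {
                // convertValue maps each entry onto a Visitor through its setters.
                list.add(mapper.convertValue(item, Visitor.class));
            }
        } catch (IOException e) {
            // JsonParseException and JsonMappingException are both IOExceptions.
            e.printStackTrace();
        }
        return list;
    }
}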
import java.util.List;

public class JsonParserTest {
    public static void main(String[] args) {
        JsonParser parser = new JsonParser();
        List<Visitor> visitors = parser.getVisitor();
        for (Visitor visitor : visitors) {
            System.out.println(visitor);
        }
    }
}
// Visitor [name=최하늘, age=23, visitTime=Sat Jan 28 00:00:00 KST 2023]
// Visitor [name=이구름, age=15, visitTime=Sun Jan 29 00:00:00 KST 2023]
// Visitor [name=김산, age=32, visitTime=Mon Jan 30 00:00:00 KST 2023]
The Visitor class differs from the one used for XML parsing in two ways. First, because a date format can be set on the mapper in JsonParser, toDate() is no longer needed. Second, the class is annotated with @JsonIgnoreProperties(ignoreUnknown = true). The mapper tries to map every property it finds in the JSON, but properties that the Visitor class does not declare should be left out, and that is what @JsonIgnoreProperties(ignoreUnknown = true) is for.
import java.util.Date;
// Assumes Jackson 2.x; the original listing does not show its imports.
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

// Properties in the JSON that this class does not declare are skipped.
@JsonIgnoreProperties(ignoreUnknown = true)
public class Visitor {
    private String name;
    private Integer age;
    private Date visitTime;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getAge() {
        return age;
    }

    public void setAge(Integer age) {
        this.age = age;
    }

    public Date getVisitTime() {
        return visitTime;
    }

    public void setVisitTime(Date visitTime) {
        this.visitTime = visitTime;
    }

    @Override
    public String toString() {
        return "Visitor [name=" + name + ", age=" + age + ", visitTime=" + visitTime + "]";
    }
}