﻿{"id":51,"date":"2011-11-26T22:27:33","date_gmt":"2011-11-26T13:27:33","guid":{"rendered":"http:\/\/fujiitoshiki.com\/improvesociety\/?p=51"},"modified":"2014-08-09T16:59:22","modified_gmt":"2014-08-09T07:59:22","slug":"input-pdf-files-standard-japanese-food-composition-table-2010-output-csv-files","status":"publish","type":"post","link":"https:\/\/www.fujiitoshiki.com\/improvesociety\/?p=51","title":{"rendered":"How to input PDF files of &#8216;Standard Tables of Food Composition in Japan 2010&#8217; and output text file?"},"content":{"rendered":"<div class=\"theContentWrap-ccc\"><p>MEXT, Ministry of Education,Culture,Sports,Science &#038; Technology in Japan has published PDF files called &#8216;Standard Tables of Food Composition in Japan 2010&#8217;. I&#8217;m interest in those files, so I have converted them to text file.<\/p>\n<p><a href=\"\/\/www.mext.go.jp\/b_menu\/shingi\/gijyutu\/gijyutu3\/houkoku\/1298713.htm\">Standard Tables of Food Composition in Japan 2010<\/a><\/p>\n<p>The procedure is as follows:<\/p>\n<ol>\n<li>Download PDF files<\/li>\n<li>Open a file, Copy all text<\/li>\n<li>Paste to EXCEL<\/li>\n<li>Process with VBA<\/li>\n<\/ol>\n<h3>1. Download PDF files<\/h3>\n<p>Download the PDF files on the above link. Top of the list is 1299012_1. Last of the list is 1299012_18.<\/p>\n<h3>2. Open a file, Copy all text<\/h3>\n<p>Open a file with Adobe Reader. In order to select all text in document, you have to select &#8216;View&#8217; -> &#8216;Page view&#8217; -> <strong>&#8216;NOT Single page&#8217;<\/strong>. Select All, copy.<\/p>\n<h3>3. Paste to EXCEL<\/h3>\n<p>&#8216;Paste option&#8217; -> &#8216;Use text file wizard&#8217;. Select option &#8216;The data field separated by delimiters such as comma or tab&#8217;. Change option data type of the first column to &#8216;String&#8217;, click &#8216;Finish&#8217;.<\/p>\n<h3>4. Process with VBA<\/h3>\n<p>Press &#8216;Alt&#8217; key and &#8216;F11&#8217; key to launch VBE. &#8216;Insert&#8217; -> &#8216;Standard Module&#8217;. Paste the code below.<\/p>\n<pre class=\"lang:vb decode:true \" >\r\nOption Explicit\r\nSub Procedure1()\r\nDim mySht           As Worksheet\r\nDim myAr1           As Variant\r\nDim myAr2           As Variant\r\nDim i               As Long\r\nDim j               As Long\r\nDim k               As Long\r\nDim myAr(1878, 53)  As String\r\nDim RegEx1          As Object\r\nDim Match1          As Object\r\nDim Matches1        As Object\r\nDim strPtn1         As String\r\nDim tempStr         As String\r\nDim RegEx2          As Object\r\nDim Match2          As Object\r\nDim Matches2        As Object\r\nDim strPtn2         As String\r\nSet mySht = ActiveSheet\r\nSet myAr1 = Application.Intersect(mySht.UsedRange, mySht.Range(\"A:B\"))\r\nSet myAr2 = mySht.UsedRange\r\nSet RegEx1 = CreateObject(\"VBScript.RegExp\")\r\nSet RegEx2 = CreateObject(\"VBScript.RegExp\")\r\nstrPtn2 = \"^[0-9]+(\\.[0-9]*)?\"\r\ni = 0\r\nFor k = 1 To myAr1.Rows.Count\r\n    If Len(myAr1.Cells(k, 1).Value) = 5 And _\r\n        myAr1.Cells(k, 1).Value >= \"01001\" And _\r\n        myAr1.Cells(k, 1).Value <= \"18016\" And _\r\n        myAr1.Cells(k, 2) <> \"\uff08\u6b20\u756a\uff09\" Then\r\n        myAr(i, 0) = myAr2.Cells(k, 1)\r\n        strPtn1 = \"^[0-9]+$\"\r\n        RegEx1.Pattern = strPtn1\r\n        RegEx1.IgnoreCase = True\r\n        RegEx1.Global = True\r\n        Set Matches1 = RegEx1.Execute(myAr2.Cells(k, 2))\r\n        If Matches1.Count > 0 Then\r\n            tempStr = myAr2.Cells(k, 2) & myAr2.Cells(k, 3)\r\n            strPtn1 = \"[^0-9][0-9]+$\"\r\n            RegEx1.Pattern = strPtn1\r\n            RegEx1.IgnoreCase = True\r\n            RegEx1.Global = True\r\n            Set Matches1 = RegEx1.Execute(tempStr)\r\n            If Matches1.Count > 0 Then\r\n                Set Match1 = Matches1.Item(Matches1.Count - 1)\r\n                myAr(i, 1) = Left(tempStr, Match1.firstindex + 1)\r\n                myAr(i, 2) = Mid(tempStr, Match1.firstindex + 2)\r\n                For j = 3 To 52\r\n                    Select Case True\r\n                    Case myAr2.Cells(k, j + 1).Value = \"-\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"Tr\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"(Tr)\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"(0)\"\r\n                        myAr(i, j) = 0\r\n                    Case Else\r\n                        myAr(i, j) = myAr2.Cells(k, j + 1).Value\r\n                    End Select\r\n                Next j\r\n                RegEx2.Pattern = strPtn2\r\n                RegEx2.IgnoreCase = True\r\n                RegEx2.Global = True\r\n                Set Matches2 = RegEx2.Execute(myAr2.Cells(k, 54))\r\n                On Error Resume Next\r\n                Set Match2 = Matches2.Item(0)\r\n                On Error GoTo 0\r\n                myAr(i, 53) = Match2.Value\r\n            Else\r\n                For j = 1 To 52\r\n                    Select Case True\r\n                    Case myAr2.Cells(k, j + 1).Value = \"-\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"Tr\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"(Tr)\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"(0)\"\r\n                        myAr(i, j) = 0\r\n                    Case Else\r\n                        myAr(i, j) = myAr2.Cells(k, j + 1).Value\r\n                    End Select\r\n                Next j\r\n                    RegEx2.Pattern = strPtn2\r\n                    RegEx2.IgnoreCase = True\r\n                    RegEx2.Global = True\r\n                    Set Matches2 = RegEx2.Execute(myAr2.Cells(k, 54))\r\n                    On Error Resume Next\r\n                    Set Match2 = Matches2.Item(0)\r\n                    On Error GoTo 0\r\n                    myAr(i, 53) = Match2.Value\r\n            End If\r\n        Else\r\n            strPtn1 = \"[^0-9][0-9]+$\"\r\n            RegEx1.Pattern = strPtn1\r\n            RegEx1.IgnoreCase = True\r\n            RegEx1.Global = True\r\n            Set Matches1 = RegEx1.Execute(myAr2.Cells(k, 2))\r\n            If Matches1.Count > 0 Then\r\n                Set Match1 = Matches1.Item(Matches1.Count - 1)\r\n                myAr(i, 1) = Left(myAr2.Cells(k, 2), Match1.firstindex + 1)\r\n                myAr(i, 2) = Mid(myAr2.Cells(k, 2), Match1.firstindex + 2)\r\n                For j = 3 To 52\r\n                    Select Case True\r\n                    Case myAr2.Cells(k, j).Value = \"-\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j).Value = \"Tr\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j).Value = \"(Tr)\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j).Value = \"(0)\"\r\n                        myAr(i, j) = 0\r\n                    Case Else\r\n                        myAr(i, j) = myAr2.Cells(k, j).Value\r\n                    End Select\r\n                Next j\r\n                RegEx2.Pattern = strPtn2\r\n                RegEx2.IgnoreCase = True\r\n                RegEx2.Global = True\r\n                Set Matches2 = RegEx2.Execute(myAr2.Cells(k, 53))\r\n                On Error Resume Next\r\n                Set Match2 = Matches2.Item(0)\r\n                On Error GoTo 0\r\n                myAr(i, 53) = Match2.Value\r\n            Else\r\n                For j = 1 To 52\r\n                    Select Case True\r\n                    Case myAr2.Cells(k, j + 1).Value = \"-\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"Tr\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"(Tr)\"\r\n                        myAr(i, j) = 0\r\n                    Case myAr2.Cells(k, j + 1).Value = \"(0)\"\r\n                        myAr(i, j) = 0\r\n                    Case Else\r\n                        myAr(i, j) = myAr2.Cells(k, j + 1).Value\r\n                    End Select\r\n                Next j\r\n                    RegEx2.Pattern = strPtn2\r\n                    RegEx2.IgnoreCase = True\r\n                    RegEx2.Global = True\r\n                    Set Matches2 = RegEx2.Execute(myAr2.Cells(k, 54))\r\n                    On Error Resume Next\r\n                    Set Match2 = Matches2.Item(0)\r\n                    On Error GoTo 0\r\n                    myAr(i, 53) = Match2.Value\r\n            End If\r\n        End If\r\n    Else\r\n        i = i - 1\r\n    End If\r\n    i = i + 1\r\nNext k\r\nSet mySht = Worksheets.Add\r\nWith mySht\r\n    .Name = myAr2.Cells(1, 1).Value & myAr2.Cells(1, 2).Value\r\n    .Range(\"A1:BB1878\").Value = myAr\r\nEnd With\r\nSet mySht = Nothing\r\nSet myAr1 = Nothing\r\nSet myAr2 = Nothing\r\nSet RegEx1 = Nothing\r\nSet Match1 = Nothing\r\nSet Matches1 = Nothing\r\nSet RegEx2 = Nothing\r\nSet Match2 = Nothing\r\nSet Matches2 = Nothing\r\nEnd Sub\r\n<\/pre>\n<p>Please delete worksheets except for worksheets which above code added. Run following code:<\/p>\n<pre class=\"lang:vb decode:true \" >\r\nSub AllSheets_to_TextFile()\r\nDim myBook          As Workbook\r\nDim mySht           As Worksheet\r\nDim tmpSht          As Worksheet\r\nDim myRng           As Range\r\nDim myAr(1877, 53)  As String\r\nDim tempAr          As Variant\r\nDim i               As Long\r\nDim j               As Long\r\nDim k               As Long\r\nDim GOF             As Variant\r\nDim RegExp          As Object\r\nDim Matches         As Object\r\nDim Match           As Object\r\nDim strPtn          As String\r\nSet RegExp = CreateObject(\"VBScript.RegExp\")\r\nstrPtn = \"\\\\\"\r\nk = 0\r\nFor Each tmpSht In Worksheets\r\n    Set myRng = tmpSht.Range(\"A1\").CurrentRegion\r\n    tempAr = myRng\r\n    For i = LBound(tempAr) To UBound(tempAr)\r\n        For j = LBound(tempAr, 2) To UBound(tempAr, 2)\r\n            myAr(k, j - 1) = tempAr(i, j)\r\n        Next j\r\n        k = k + 1\r\n    Next i\r\nNext tmpSht\r\nGOF = Application.GetOpenFilename(FileFilter:=\"PDF file,*.pdf\", _\r\n                                  Title:=\"Select PDF file\", _\r\n                                  MultiSelect:=False)\r\nIf TypeName(GOF) = \"Boolean\" Then Exit Sub\r\nGOF = Left(GOF, Len(GOF) - 4) & \".txt\"\r\nWith RegExp\r\n    .Pattern = strPtn\r\n    .IgnoreCase = True\r\n    .Global = True\r\nEnd With\r\nSet Matches = RegExp.Execute(GOF)\r\nGOF = Left(GOF, Matches.Item(Matches.Count - 1).FirstIndex) & \"\\M_FOODS.txt\"\r\nSet mySht = Worksheets.Add\r\nWith mySht\r\n    .Name = \"M_FOODS\"\r\n    .Range(\"A1:BB1878\") = myAr\r\n    .Move\r\nEnd With\r\nActiveWorkbook.SaveAs Filename:=GOF, _\r\n                      FileFormat:=xlText, _\r\n                      CreateBackup:=False\r\nSet myBook = ActiveWorkbook\r\nApplication.DisplayAlerts = False\r\nmyBook.Close\r\nApplication.DisplayAlerts = True\r\nSet Match = Nothing\r\nSet Matches = Nothing\r\nSet RegExp = Nothing\r\nSet myRng = Nothing\r\nSet mySht = Nothing\r\nSet tmpSht = Nothing\r\nSet myBook = Nothing\r\nEnd Sub\r\n<\/pre>\n<p>I can not upload the text file because of copyright. Please contact MEXT about legal issues.<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>MEXT, Ministry of Education,Culture,Sports,Science &#038; Technology in Japan has published PDF files called &#038; &hellip; <a href=\"https:\/\/www.fujiitoshiki.com\/improvesociety\/?p=51\" class=\"more-link\"><span class=\"screen-reader-text\">&#8220;How to input PDF files of &#8216;Standard Tables of Food Composition in Japan 2010&#8217; and output text file?&#8221; \u306e<\/span>\u7d9a\u304d\u3092\u8aad\u3080<\/a><\/p>\n","protected":false},"author":1,"featured_media":6026,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[6,4],"tags":[303,304,277],"class_list":["post-51","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-excel","category-nutrition","tag-mext","tag-ministry-of-education-culture-sports-science-technology-in-japan","tag-standard-tables-of-food-composition-in-japan-2010"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/posts\/51","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=51"}],"version-history":[{"count":23,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/posts\/51\/revisions"}],"predecessor-version":[{"id":6212,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/posts\/51\/revisions\/6212"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=\/wp\/v2\/media\/6026"}],"wp:attachment":[{"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=51"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=51"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fujiitoshiki.com\/improvesociety\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=51"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}