Precision Nutrient Management Using Artificial Intelligence Based on Digital Data Collection Framework

Lee, Hsiu-An; Huang, Tzu-Ting; Yen, Lo-Hsien; Wu, Pin-Hua; Chen, Kuan-Wen; Kung, Hsin-Hua; Liu, Chen-Yi; Hsu, Chien-Yeh

doi:10.3390/app12094167

¹ National Health Research Institutes-The National Institute of Cancer Research, Tainan 704, Taiwan

² Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei 112, Taiwan

³ College of Health Technology, National Taipei University of Nursing and Health Sciences, Taipei 112, Taiwan

⁴ Master Program in Global Health and Development, College of Public Health, Taipei Medical University, Taipei 110, Taiwan

* Authors to whom correspondence should be addressed.

Appl. Sci. 2022, 12(9), 4167; https://doi.org/10.3390/app12094167

Submission received: 17 February 2022 / Revised: 18 April 2022 / Accepted: 19 April 2022 / Published: 20 April 2022

(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)

Abstract

(1) Background: Nutritional intake is fundamental to human growth and health, and the intake of different types of nutrients and micronutrients can affect health. The content of the diet affects the occurrence of disease; the incidence of many diseases is increasing each year while the age of onset is gradually decreasing. (2) Methods: An artificial intelligence model for precision nutritional analysis allows the user to enter the name and serving size of a dish to assess a total of 24 nutrients. Two AI models, a semantic analysis model and a nutritional analysis model, were integrated into the Precision Nutritional Analysis. Five different algorithms were used to identify the most similar recipes, and cosine similarity was used to determine differences in text. (3) Results: This study developed two models to form a precision nutrient analysis model. The 2013–2016 Taiwan National Nutrition Health Status Change Survey (NNHS) was used for model verification. The model’s accuracy was determined by comparing the results of the model with the NNHS. At the level of individual foods, at least 80% of the data for each of the 24 nutrients differed from the NNHS values by less than 2%, and the automated analysis can significantly improve the efficiency of nutrient analysis. (4) Conclusions: This study proposed an Intelligence Precision Nutrient Analysis Model based on a digital data collection framework, where the nutrient intake was analyzed by entering dietary recall data. The AI model can be used as a reference for nutrition surveys and personal nutrition analysis.

Keywords: nutrition survey; precision diet analysis; medical intelligence

1. Introduction

Nutritional intake is the basis for human growth and health, and the intake of different types of nutrients and micronutrients can affect health. Most diseases are inextricably linked to diet. Diabetes, cardiovascular diseases (hypertension, hyperlipidemia), gout, peptic ulcers, and gastroenteritis are all diet-related diseases that are increasing in prevalence every year, while the age of onset is gradually decreasing. The development of the Internet has made it possible to conduct online nutrition surveys through large-scale food and nutrition databases linked to automated dietary records, and there are now a growing number of software tools, platforms, and applications for nutrition surveys [1].

The most common technologies used for dietary recording are web-based or online tools, mobile apps, camera-based image analysis tools, and wearable sensors, while traditional methods rely on Food Frequency Questionnaires (FFQs) or 24 h dietary recalls. However, past techniques have suffered from inaccurate recording: with recall methods, respondents may not accurately report the food consumed or may have difficulty estimating portion sizes, and the available food ingredient lists may be limited [2].

The coding and translation of food records from nutrition surveys into nutrient analyses are labor-intensive and time-consuming, which makes it difficult to collect detailed information on food intake in large-scale population studies. Such studies rely on answers to food frequency questionnaires, and, unlike other self-reported measures, the accuracy of these data also depends on the expertise of the interviewer [3,4].

Innovative technological tools have evolved with the development of various IT technologies, including natural language analysis of text, speech analysis, and image processing. The popularity of smartphones, tablets, and computers has increased the acceptance of using IT for nutritional intake assessments [5,6,7,8].

This study develops an artificial intelligence model for a precision nutrient analysis, which allows users to enter the name of a dish and serving size to assess a total of 24 nutrients. The recipes can be modified by the user, which allows the model to be used in all countries and all contexts, thus improving interoperability and accuracy of the analysis.

2. Related Works

The Food Record, the 24 h dietary recall (24HR), and the Food Frequency Questionnaire (FFQ) are three common methods of collecting nutritional data. The Food Record is a comprehensive record of all foods, beverages, and nutritional supplements consumed by the respondent over a specified period of time. Usually, 3–4 days of intake are recorded, as the quality (accuracy) of the record declines when respondents are burdened with recording too many days. Ideally, dietary intake should be weighed and measured; however, most respondents only record estimates made before and after eating, which leads to differences in weight judgments [9].

The 24HR method assesses the nutritional intake of a respondent over the past 24 h. Ideally, the survey collects information on nutritional intake over multiple 24 h periods on nonconsecutive, random dates. The 24HR method is usually conducted by a dedicated interviewer by telephone or in person [9]. Some 24HR surveys can also be self-recorded or collected online (e.g., the Automated Self-Administered 24 h dietary recall, ASA24 [10]). Compared with the interviewer-administered 24HR method, ASA24 primarily reduces interviewer burden and interview costs and allows respondents to answer questions at their own pace; however, it may not be suitable for all study populations.

The use of exploratory questions in the 24HR recall method facilitates easy response and has been shown to improve the accuracy of data collection [11]. The survey includes how the food was prepared, what was added after preparation (seasonings, creams, and spices), and when the meal was served [9]. The FFQ assesses general nutritional intake over a specific period of time, usually a longer period, and asks how often a person consumes food. The FFQ method is a more cost-effective alternative to the 24HR method because respondents can complete the survey themselves, and it can be used for large sample studies [12].

Self-reported dietary information is subject to several types of systematic measurement error; for example, based on general perceptions, most respondents tend to report foods that are perceived as healthy and to underreport less healthy foods. Differences in susceptibility to this tendency between groups of respondents can lead to additional personal bias. Differences in the ability to self-assess and recall portions can also lead to individual subjective differences. This systematic error is unpredictable, but studies suggest that it may be related to factors such as age and gender [13]. While each person uses different strategies to recall portion sizes, including taking photographs and using measurement aids (e.g., food models) to estimate [14,15], research shows that training can lead to a more accurate assessment of food portions [16,17]. In addition, researchers or the methods used to collect dietary data may also be biased [18].

Finally, the accuracy of converting dietary records into nutrient totals depends on the accuracy and availability of the food ingredient database used for conversion to calories and nutrients. In summary, both types of errors weaken the assessment of the relationship between diet and health, as well as the accuracy of the statistical analysis. However, while there may be some slight deviations in the relationships tested [19], valid conclusions can be drawn when the results of significance analyses are properly evaluated.

3. Materials and Methods

This study developed an AI model based on semantic text analysis to analyze the nutritional ingredients of a dish, and a digital data semantic analysis model was designed to determine the names and servings of the dishes consumed. The AI model is based on the ingredients of common Taiwanese recipes and automatically calculates the nutrient intake. The model structure, shown in Figure 1, consists of a digital data semantic analysis model, an AI precision nutrient analysis model, and a database of 1590 recipes and 7869 ingredients from common Taiwanese recipe databases. The nutrition information of the ingredients was obtained from the public data of the Health Promotion Administration, Ministry of Health and Welfare, Taiwan (HPA, MoHW).

3.1. Artificial Intelligence Semantic Analysis Model

After data entry, the text was segmented and annotated, and a CKIP pretrained model was used to interpret the Chinese words. Part-of-speech tagging and entity identification were then performed. Finally, the nouns (dish names) were converted into vectors using word2vec, a Natural Language Processing technique proposed by Tomas Mikolov et al. at Google in 2013 and one of the most significant advances in the field of machine learning in recent years. Word2vec learns from large amounts of textual data and transforms words into numerical vectors; by embedding words into a multidimensional vector space, it places words with similar semantic meanings closer together.

This study used the continuous bag-of-words (CBOW) method, which predicts a target word from the words in its surrounding context and, in doing so, learns the relationships between similar words. As similar words are clustered together, the direction of the vector between two words corresponds to their relative relationship.
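As a rough illustration of how CBOW frames its training data, the sketch below builds (context, target) pairs from a tokenized dish name. The function name `cbow_pairs`, the window size, and the English tokens are assumptions made for this example; they are not taken from the study, which worked on segmented Chinese text.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs as used to train CBOW."""
    pairs = []
    for i, target in enumerate(tokens):
        # Words up to `window` positions to the left and right of the target.
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            pairs.append((context, target))
    return pairs


# Hypothetical tokenized dish name, used only for illustration.
tokens = ["cabbage", "with", "pork", "fat"]
for context, target in cbow_pairs(tokens, window=1):
    print(context, "->", target)
```

A CBOW model is then trained to predict each target from the average of its context word vectors, which is what pulls words that occur in similar contexts toward each other.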

3.2. Artificial Intelligence Nutritional Analysis Model

The Nutritional Analysis Model is divided into three steps. Step 1 uses artificial intelligence analysis to determine the most similar recipes. Because Chinese words are composed of multiple characters, a single semantic analysis algorithm may not be precise enough; therefore, several algorithms were used. The AI model is composed of five different algorithms: (1) Okapi BM25, (2) TF–IDF, (3) Levenshtein, (4) Jaccard, and (5) Synonyms. The model also uses cosine similarity to determine differences in text and then compares the input with the database to obtain food information and portion sizes for recipe and ingredient identification. Step 2 determines the best solution using a common voting mechanism. Step 3 calculates the nutritional ingredients.
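The cosine similarity mentioned above can be sketched as follows. This is a minimal example that assumes each dish name is represented as a vector of character counts; the study does not state which vector representation it used, so the representation and the sample names are illustrative only.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between the character-count vectors of two strings."""
    va, vb = Counter(a), Counter(b)
    # Dot product over the characters the two strings share.
    dot = sum(va[ch] * vb[ch] for ch in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


# Hypothetical user input compared with a hypothetical recipe name.
print(cosine_similarity("pork rib soup", "pork ribs soup"))
```

Values close to 1 indicate nearly identical character distributions, and 0 indicates that the two names share no characters.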

3.2.1. Step 1. Artificial Intelligence Analysis

(1): Okapi Best Matching (Okapi BM25)

The probabilistic retrieval framework behind this algorithm was developed by Stephen E. Robertson, Karen Spärck Jones, and other scholars in the 1970s and 1980s [20,21,22]. BM25 is still widely regarded as one of the most advanced ranking algorithms. It is a bag-of-words model that ranks a set of documents according to the query terms appearing in each document and yields a set of scores that can be compared with each other.

The BM25 similarity formula is shown in Equation (1).

\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \left[ \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)} + \delta \right]

(1)

Equation (1) BM25 Similarity Formula

f(q_i, D): Frequency of the term q_i in document D.
|D|: Length of document D (in words).
k_1: Term-frequency saturation parameter.
b: Length normalization parameter.
avgdl: Average document length in the document collection.
IDF(q_i): Inverse document frequency of the term q_i, computed from n(q_i) and N.
n(q_i): Number of documents containing q_i.
N: Total number of documents in the collection (the n in the summation limit of Equation (1) is the number of query terms).
δ: Additive constant applied to each term's contribution; δ = 0 gives the standard BM25 score.
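Equation (1) can be sketched in a few lines of Python. The paper does not spell out the IDF term, so this example assumes the commonly used smoothed form ln((N − n(q_i) + 0.5)/(n(q_i) + 0.5) + 1); the parameter defaults (k1 = 1.5, b = 0.75, delta = 0) and the toy recipe documents are likewise assumptions for illustration, not the study's settings.

```python
import math


def bm25_scores(query, docs, k1=1.5, b=0.75, delta=0.0):
    """Score every tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs  # average document length
    scores = []
    for doc in docs:
        score = 0.0
        for term in query:
            n_q = sum(1 for d in docs if term in d)  # documents containing term
            # Assumed IDF form (not given in the paper).
            idf = math.log((n_docs - n_q + 0.5) / (n_q + 0.5) + 1.0)
            f = doc.count(term)  # term frequency in this document
            denom = f + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * (f * (k1 + 1) / denom + delta)
        scores.append(score)
    return scores


# Hypothetical recipe names, already split into words.
docs = [
    ["cabbage", "pork", "fat"],
    ["bamboo", "shoots", "pork", "ribs", "soup"],
    ["fried", "rice"],
]
query = ["cabbage", "pork"]
scores = bm25_scores(query, docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(docs[best], scores)
```

The scores are only meaningful relative to each other within one query, which is why the voting step later uses each algorithm's top-ranked recipe rather than the raw score.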

(2): Term Frequency–Inverse Document Frequency (TF–IDF)

This algorithm is a weighting technique widely used in information retrieval and text mining, and the combination of TF and IDF was first discussed by Karen Spärck Jones [23]. TF–IDF assesses the importance of a word in a document: the weight increases with the number of times the word appears in the document but decreases with the number of documents in the collection that contain the word. The term frequency (TF) formula is shown in Equation (2), and the inverse document frequency (IDF) formula is shown in Equation (3); the TF–IDF weight is the product of the two.

\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}

(2)

Equation (2) Term Frequency (TF) Formula

Numerator: n_{i,j} denotes the number of occurrences of the word t_i in document d_j.
Denominator: The sum of the occurrences of all words in document d_j.

\mathrm{idf}_{i} = \lg \frac{|D|}{|\{j : t_{i} \in d_{j}\}|}

(3)

Equation (3) Inverse Document Frequency (IDF) Formula

Numerator: |D|, the total number of documents.
Denominator: The number of documents containing the term t_i.
The result is obtained by taking the base-10 logarithm of this quotient.
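Equations (2) and (3) translate directly into code. The sketch below is a minimal illustration on made-up tokenized recipe names; the function names and the choice to return 0 for a term that appears in no document are assumptions of this example.

```python
import math


def tf(term, doc):
    """Equation (2): occurrences of `term` divided by the length of `doc`."""
    return doc.count(term) / len(doc)


def idf(term, docs):
    """Equation (3): base-10 log of (number of docs / docs containing term)."""
    containing = sum(1 for d in docs if term in d)
    if containing == 0:
        return 0.0  # assumption: unseen terms carry no weight
    return math.log10(len(docs) / containing)


def tf_idf(term, doc, docs):
    """TF-IDF weight as the product of the two quantities above."""
    return tf(term, doc) * idf(term, docs)


# Hypothetical tokenized recipe names.
docs = [["pork", "soup"], ["pork", "rice"], ["cabbage", "pork"]]
print(tf_idf("soup", docs[0], docs))  # distinctive word, positive weight
print(tf_idf("pork", docs[0], docs))  # appears in every document, weight 0
```

A word that occurs in every recipe name, such as "pork" here, receives zero weight, so matching is driven by the more distinctive words.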

(3): Levenshtein

The Russian scientist Vladimir Levenshtein first proposed this algorithm in 1965 [24]. The Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. Its basic form is computed with a recursive algorithm, and a threshold can be set as an upper limit on the number of edit operations. The Levenshtein distance formula is shown in Equation (4).

\mathrm{lev}_{a,b}(i, j) = \begin{cases} \max(i, j) & \text{if } \min(i, j) = 0, \\ \min \begin{cases} \mathrm{lev}_{a,b}(i-1, j) + 1 \\ \mathrm{lev}_{a,b}(i, j-1) + 1 \\ \mathrm{lev}_{a,b}(i-1, j-1) + 1_{(a_{i} \neq b_{j})} \end{cases} & \text{otherwise.} \end{cases}

(4)

Equation (4) Levenshtein Distance Formula
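The recurrence in Equation (4) is usually evaluated bottom-up rather than by naive recursion. The sketch below is one such dynamic-programming version that keeps only two rows of the table; it is an illustration of the formula and not necessarily how the study implemented it.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions from a to b."""
    # prev[j] holds lev(i-1, j); the first row is the distance from "" to b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1  # the indicator 1_(a_i != b_j)
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # → 3
```

Because Python strings are sequences of Unicode characters, the same function works unchanged on Chinese dish names, where each character counts as one edit unit.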

(4): Jaccard

The intersection and union of two sample sets can be used to derive the Jaccard similarity coefficient and the Jaccard distance for different applications [25]. For finite sample sets, the Jaccard coefficient measures similarity as the ratio of the size of the intersection of the two sets to the size of their union. The Jaccard index formula is shown in Equation (5).

J (A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}

(5)

Equation (5) Jaccard Index Formula
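Equation (5) can be illustrated with Python sets. The sketch assumes that each dish name is treated as a set of characters, which is a natural choice for Chinese text but is an assumption here, since the study does not state how the sets were formed.

```python
def jaccard(a, b):
    """Jaccard index of the character sets of two strings (Equation (5))."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty strings are identical
    return len(set_a & set_b) / len(union)


print(jaccard("abc", "bcd"))  # → 0.5
```

The corresponding Jaccard distance is simply `1 - jaccard(a, b)`.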

(5): Synonyms

Synonyms is an open-source Python package for natural language tasks maintained by Chatopera. It supports a variety of NLP tasks, such as text alignment, recommendation algorithms, similarity calculation, semantic shifting, keyword extraction, concept extraction, automatic summarization, and search engines, and it draws on a multisource lexical database. For the word vector conversion task, the package uses the gensim library with a word2vec model for conversion and approximates the vector distance between words with a smooth gradient descent algorithm [26].

3.2.2. Step 2. Common Voting Mechanism

In this study, the same approximation task was assigned to each of the five algorithms described above. After each algorithm returned its best-matching dish, the dish with the most votes was taken as the best solution. The confidence scores are not comparable across algorithms (Levenshtein distance, for example, yields a minimum number of edit steps rather than a confidence score), so each score is meaningful only for comparisons within the same algorithm. For this reason, instead of averaging the similarity scores for the same dish across algorithms, only the top-scoring dish of each algorithm was used as its vote, and in the final count every algorithm's vote carried equal weight, which makes the result a fair majority decision.
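The voting step can be sketched with a simple counter. The recipe names and the per-algorithm choices below are invented for illustration, and the tie-breaking behaviour (the choice seen first wins) is an assumption of this sketch, since the paper does not describe how ties are resolved.

```python
from collections import Counter


def majority_vote(top_choices):
    """Pick the recipe named most often among the algorithms' top choices.

    `top_choices` maps an algorithm name to the recipe it ranked first.
    Each algorithm contributes exactly one equally weighted vote.
    """
    counts = Counter(top_choices.values())
    winner, votes = counts.most_common(1)[0]
    return winner, votes


# Hypothetical top-1 results from the five algorithms.
top_choices = {
    "bm25": "cabbage with pork fat",
    "tfidf": "cabbage with pork fat",
    "levenshtein": "stir-fried cabbage",
    "jaccard": "cabbage with pork fat",
    "synonyms": "stir-fried cabbage",
}
print(majority_vote(top_choices))  # → ('cabbage with pork fat', 3)
```

Voting on ranks rather than scores sidesteps the problem that a BM25 score, a Jaccard ratio, and an edit distance live on different scales.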

3.2.3. Step 3. Nutritional Ingredient Analysis

The recipe data were obtained through a fuzzy analysis of the artificial intelligence model, and the nutritional ingredient analysis automatically determined all the ingredients in the dish. Finally, this study consolidated all the nutrients by means of portion calculation to complete the nutrient analysis. The dietary information conversion process is shown in Figure 2.
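The portion calculation can be sketched as scaling per-100 g nutrient values by the weight of each ingredient and summing over the dish. The data layout, the function name, and all numeric values below are placeholders invented for this example; they are not HPA data, and the study's actual database schema is not described in the paper.

```python
def dish_nutrients(ingredient_grams, per_100g):
    """Sum nutrients over a dish's ingredients, scaled by weight in grams."""
    totals = {}
    for ingredient, grams in ingredient_grams.items():
        for nutrient, amount in per_100g[ingredient].items():
            # Nutrient tables are assumed to be expressed per 100 g of food.
            totals[nutrient] = totals.get(nutrient, 0.0) + amount * grams / 100.0
    return totals


# Placeholder composition table (values are NOT real nutrition data).
per_100g = {
    "ingredient_a": {"energy_kcal": 10.0, "protein_g": 1.0},
    "ingredient_b": {"energy_kcal": 100.0},
}
# Placeholder recipe breakdown for one serving.
ingredient_grams = {"ingredient_a": 200.0, "ingredient_b": 50.0}
print(dish_nutrients(ingredient_grams, per_100g))
```

Summing the per-dish totals over all dishes in a dietary record then gives the intake for each of the 24 nutrients.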

4. Results

This study developed two models to form a precision nutrient analysis model. The first is a Digitized Data Semantic Analysis Model for dish analysis and portion size determination. The second is a Nutrient Analysis Model that uses five different algorithms and a common voting process to find the best-matching recipes and then analyzes the ingredients and nutrients of each dish. The final outputs of both models are used to calculate the intake of 24 common nutrients. The recipe database contains 1590 recipes and nutrient information for 7869 ingredients. The model operating framework is shown in Figure 3.

4.1. Operation Example

An example of a dietary recall record for precise nutritional analysis is as follows:

Input the dietary record to the model.
Dietary Record: “Today I had a plate of cabbage with pork fat and a bowl of bamboo shoots and pork ribs soup.”
The names of the dishes and the portion sizes were analyzed by the Semantic Analysis Model. Nd is defined as time; Nf is defined as a quantity, and Na is defined as a common noun.
Segmentation of record:

[{"label": "Today", "Pos": "Nd"}, {"label": "plate", "Pos": "Nf"}, {"label": "cabbage", "Pos": "Na"}, {"label": "pork fat", "Pos": "Na"}, {"label": "bowl", "Pos": "Nf"}, {"label": "bamboo shoots", "Pos": "Na"}, {"label": "pork ribs", "Pos": "Na"}, {"label": "soup", "Pos": "Na"}]

In this study, the plate of the dish represents 200 g, and the soup bowl represents 200 g.
Nutrition intake calculation by the Precision Nutrient Analysis Model.
(1) In Step 1: Each dish was separated into its ingredients according to the recipes.
200 g of cabbage with pork fat: Cabbage: 163 g, Pork Fat: 5.63 g, Carrots: 28.74 g, and Salt: 2.63 g.
200 g of bamboo shoots and pork ribs soup: Water: 50.13 g, White pepper: 0.63 g, Bamboo shoot: 73.43 g, Pork chops: 75.19 g, and Salt: 0.63 g.
(2) In Step 2: 24 nutrients were calculated for each ingredient, and the precision nutrient analysis results were calculated based on the sum of all nutrients.
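The segmentation output in the operation example can be turned into dishes and portions with a small grouping routine. The sketch below assumes a simple rule that is not described in the paper: every quantity word (Nf) starts a new dish, and the common nouns (Na) that follow it belong to that dish. The 200 g values for a plate and a bowl come from the example above; the function name and output layout are hypothetical.

```python
# Portion sizes stated in the operation example: plate = 200 g, bowl = 200 g.
PORTION_GRAMS = {"plate": 200, "bowl": 200}


def extract_dishes(tokens):
    """Group tagged tokens into dishes, one per quantity word (Nf)."""
    dishes = []
    current = None
    for tok in tokens:
        label, pos = tok["label"].strip(), tok["Pos"]
        if pos == "Nf":
            # A quantity word opens a new dish (assumed grouping rule).
            current = {"unit": label, "grams": PORTION_GRAMS.get(label), "words": []}
            dishes.append(current)
        elif pos == "Na" and current is not None:
            current["words"].append(label)
        # Other tags, such as Nd (time), are ignored.
    return dishes


tokens = [
    {"label": "Today", "Pos": "Nd"}, {"label": "plate", "Pos": "Nf"},
    {"label": "cabbage", "Pos": "Na"}, {"label": "pork fat", "Pos": "Na"},
    {"label": "bowl", "Pos": "Nf"}, {"label": "bamboo shoots", "Pos": "Na"},
    {"label": "pork ribs", "Pos": "Na"}, {"label": "soup", "Pos": "Na"},
]
for dish in extract_dishes(tokens):
    print(dish["grams"], "g:", " ".join(dish["words"]))
```

The word lists produced here would then be passed to the recipe-matching algorithms to find the closest recipe name in the database.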

4.2. Model Accuracy Verification

The accuracy of the model was analyzed using nutrition survey data, namely the 2013–2016 National Nutrition Health Status Change Survey (NNHS). The NNHS was initiated by the HPA, MoHW, and is conducted in a four-year cycle; its design takes county and city distribution, as well as seasonal effects, into account. The collected data are used as a reference for the formulation of national nutrition and health-related policies in Taiwan.

The aim of the survey is to understand the nutrition, health, diet, and lifestyle of the Taiwanese people and their relevance, in order to establish a long-term, stable, and nationally representative nutrition and health surveillance mechanism. The results can be used as a basis for government policies regarding diet and nutrition and health promotion and disease prevention and can help improve the health status of the population and prevent possible future health problems.

The NNHS uses a multistage stratified cluster sampling design. The sample covers all age groups, excluding pregnant and breastfeeding women, people without self-awareness, and institutional care residents, and the overall sample is representative of the Taiwanese population. The nutrition data were recorded as 24 h dietary recall records and analyzed by professional nutritionists.

4.2.1. Data Resource

The “2013–2016 National Survey of Changes in Nutritional Health Status” was used to validate the accuracy of the model. The data contain a “24-h dietary recall nutrient intake sum analysis file” and a “24-h dietary recall food weight and nutrient ingredient file” with the information of 24 nutrients, including Energy, Water, Protein, Lipid Fat, Sugars Total, Calcium Ca, Phosphorus P, Iron Fe, VitaminB1, VitaminB2, Nicotinic, Vitamin C, Saturated Fat, Cholesterol, Vitamin E alpha TE, Sodium Na, VitaminB6, Magnesium, Dietary Fiber, Potassium K, Equivalent, VitaminB12, Zinc Zn, and VitaminD2D3.

24 h dietary recall nutrient intake sum analysis file (sum_nutrients_24hH): 2602 data entries in total. This file contains the total nutrient intake for each single 24 h dietary recall survey.
24 h dietary recall food weight and nutrient ingredient file (food_wt_and_nutrients): 113,824 data entries in total. This file contains the sum of nutrients for each individual dish, food, health product, etc.

4.2.2. Validation Process

(1): Input data from the “2013–2016 National Survey of Changes in Nutritional Health Status” into the digitized data semantic analysis model;
(2): Analyze the dishes, portion sizes, and the ingredients in the dishes with the model;
(3): Analyze nutrient intake using the AI Precision Nutrient Analysis Model;
(4): Compare the results against the “24-h dietary recall nutrient intake sum analysis file” and the “24-h dietary recall food weight and nutrient ingredient file”;
(5): Evaluate the accuracy of the model.

4.3. Analysis Result

The results of the nutrition survey team analysis (from the 24 h dietary recall nutrient intake sum analysis file) were used as the gold standard, and the results of the model in this study were compared against them in a nutrient difference ratio analysis. The discrepancy comparison tables of the NNHS analysis with the results of this study in the 24 h dietary recall nutrient intake sum analysis are shown in Table 1, Table 2 and Table 3.

A total of 2602 data entries were analyzed for total nutrient intake, with 24 different nutrients analyzed for each data item. The differences between the results of this study and the results of the nutrition survey are shown in Table 1, Table 2 and Table 3. For 13 nutrients, more than 95% of the data (2472 data entries) had an intake error of <5%; for 3 nutrients, 90–94% of the data had an intake error of <5%; and for 5 nutrients, 80–89.99% of the data had an intake error of <5%. For Vitamin E alpha TE, more than 95% of the data (2472 data entries) had an intake error of <10%, and for Sugars Total and VitaminD2D3, 70% of the data had an intake error of <10%.
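The proportions reported above can be computed with a short routine. The sketch assumes that the intake error of a record is the relative difference |model − reference| / |reference| and that records with a reference value of zero are skipped; the paper does not define the difference ratio explicitly, so both choices are assumptions, and the sample numbers are invented.

```python
def share_within(model_values, reference_values, threshold):
    """Fraction of records whose relative error is below `threshold`."""
    within = 0
    counted = 0
    for m, r in zip(model_values, reference_values):
        if r == 0:
            continue  # relative error undefined; skipped (assumption)
        counted += 1
        if abs(m - r) / abs(r) < threshold:
            within += 1
    return within / counted if counted else 0.0


# Invented values for one nutrient across four records.
model = [100.0, 104.0, 120.0, 50.0]
reference = [100.0, 100.0, 100.0, 50.0]
print(share_within(model, reference, 0.05))  # → 0.75
```

Applying the function per nutrient with thresholds of 0.05 and 0.10 would reproduce the kind of breakdown summarized in the paragraph above.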

The results of the nutrition survey team analysis (from the 24 h dietary recall food weight and nutrient ingredient file) were used as the gold standard, and the results of the model in this study were compared against them in a nutrient difference ratio analysis. The discrepancy comparison tables of the NNHS analysis with the results of this study for the 24 h dietary recall food weight and nutrient ingredients are shown in Table 4, Table 5 and Table 6.

A total of 113,824 data entries were analyzed for food weight and nutrient ingredients, with 24 different nutrients analyzed for each data item. The differences between the results of this study and the results of the nutrition survey are shown in Table 4, Table 5 and Table 6. For 3 nutrients, more than 95% of the data had an intake error of <2%; for 9 nutrients, 90–94% of the data had an intake error of <2%; and for 12 nutrients, 80–89.99% of the data had an intake error of <2%.

5. Discussion

Each 24 h dietary recall nutrition survey in this study took approximately 40 min. The volume and complexity of the survey data and the variation in the ability to self-assess and recall portions can lead to individual subjective differences [13]. Similarly, the researchers or the methods used to collect dietary data may be biased [18].

Therefore, this study improved the accuracy of nutrient intake analysis by compensating for such errors through fuzzy analysis and artificial intelligence. Conventional FFQs are primarily designed to assess total nutrient intake or changes in intake over time [27,28,29]; however, the FFQ limits the range of foods that can be investigated because it groups foods and beverages together, so the exact amount of nutrients is determined less precisely than with more detailed methods. It is also not possible to accurately measure absolute intakes of different food components. Moreover, FFQs require literacy and the physical ability to complete the questionnaire, and the survey can be burdensome for subjects and difficult or confusing to complete because of poor descriptions or questions that are hard to understand. The most commonly used methods in nutrition research are the Food Record, 24HR, and FFQ.

The Food Record is also used as the gold standard in validation studies [30]. Given the contingent nature of respondents’ food choices, methods that capture the variety of food and beverage combinations [31] and nutrient supplementation [32] are the most suitable for investigation. The data discrepancy comparisons in this study indicate that the artificial intelligence model is a feasible strategy for reducing the burden on surveyors in large-scale nutritional surveys.

When comparing the difference between our model and the data analyzed in the actual nutrition survey, the results for the “24-h dietary recall food weight and nutrient ingredient” file were highly accurate: for every nutrient, at least 80% of the data showed a discrepancy of less than 2%. This result indicates that the nutrient values of the ingredient data in our model are correct. In the 24 h dietary recall nutrient intake sum analysis, the model conducted an artificial intelligence analysis of the dishes, that is, an automated analysis of the components and servings to estimate nutrient intake. For 22 of the 24 nutrients, at least 80% of the data fell within a 10% margin of error; the exceptions were Sugars Total and VitaminD2D3, for which 70% of the data had an error of <10%. These results support the accuracy of the model in this study.

6. Conclusions

This study proposed an Intelligence Precision Nutrient Analysis Model based on a digital data collection framework, where the nutrient intake was analyzed by entering dietary recall data. The AI Precision Nutrient Analysis Model was used to analyze the ingredients of the dishes and calculate nutrient intake automatically, and portion sizes were analyzed using a digital data semantic analysis model. The results of this study show very little difference in nutrient intake between the model and the NNHS analysis; therefore, the AI model can be used as a reference for nutrition surveys and personal nutrition analysis. In terms of data access, because there is not yet a complete set of publicly available data on food nutrient ingredients, more complete data and references on micronutrients should be made available in the future. In addition, the scope of recipes should be expanded.

Author Contributions

The work presented in this paper was carried out in collaboration among all authors. H.-A.L. and C.-Y.L. formed the conception and study design. K.-W.C. and C.-Y.L. carried out the data analysis; H.-A.L. performed the literature review; C.-Y.L., T.-T.H., L.-H.Y., P.-H.W., and H.-H.K. performed the model development; H.-A.L., P.-H.W., and K.-W.C. drafted the manuscript, and C.-Y.H. made significant revisions and supplied valuable improvement suggestions. All authors have read and agreed to the published version of the manuscript.

Funding

This project received funding from the Ministry of Science and Technology, Taiwan, under project no. 110-2221-E-227-003-MY3 and the Ministry of Education, Taiwan, under project no. 107EH12-402-022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Cade, J.E. Measuring diet in the 21st century: Use of new technologies. Proc. Nutr. Soc. 2017, 76, 276–282.
2. Cade, J.E.; Warthon-Medina, M.; Albar, S.; Alwan, N.A.; Ness, A.; Roe, M.; Wark, P.A.; Greathead, K.; Burley, V.J.; et al.; on behalf of the DIET@NET Consortium. DIET@NET: Best Practice Guidelines for dietary assessment in health research. BMC Med. 2017, 15, 1–15.
3. Freedman, L.S.; Schatzkin, A.; Midthune, D.; Kipnis, V. Dealing with Dietary Measurement Error in Nutritional Cohort Studies. JNCI 2011, 103, 1086–1092.
4. Freedman, L.S.; Potischman, N.; Kipnis, V.; Midthune, U.; Schatzkin, A.; Thompson, F.; Troiano, R.P.; Prentice, R.; Patterson, R.; Carroll, R.; et al. A comparison of two dietary instruments for evaluating the fat–breast cancer relationship. Int. J. Epidemiol. 2006, 35, 1011–1021.
5. Carter, M.C.; Burley, V.; Nykjaer, C.; Cade, J. Adherence to a Smartphone Application for Weight Loss Compared to Website and Paper Diary: Pilot Randomized Controlled Trial. J. Med. Internet Res. 2013, 15, e32.
6. Timon, C.M.; Blain, R.J.; McNulty, B.; Kehoe, L.; Evans, K.; Walton, J.; Flynn, A.; Gibney, E.R. The Development, Validation, and User Evaluation of Foodbook24: A Web-Based Dietary Assessment Tool Developed for the Irish Adult Population. J. Med. Internet Res. 2017, 19, e158.
7. Boushey, C.J.; Harray, A.J.; Kerr, D.A.; Schap, T.; Paterson, S.; Aflague, T.; Ruiz, M.B.; Ahmad, Z.; Delp, E.J.; Harnack, L.; et al. How Willing Are Adolescents to Record Their Dietary Intake? The Mobile Food Record. JMIR MHealth UHealth 2015, 3, e47.
8. Thompson, F.E.; Dixit-Joshi, S.; Potischman, N.; Dodd, K.W.; Kirkpatrick, S.I.; Kushi, L.H.; Alexander, G.L.; Coleman, L.A.; Zimmerman, T.P.; Sundaram, M.E.; et al. Comparison of Interviewer-Administered and Automated Self-Administered 24-Hour Dietary Recalls in 3 Diverse Integrated Health Systems. Am. J. Epidemiol. 2015, 181, 970–978.
9. Thompson, F.E.; Byers, T. Dietary assessment resource manual. J. Nutr. 1994, 124, s2245–s2317.
10. Subar, A.F.; Kirkpatrick, S.I.; Mittl, B.; Zimmerman, T.P.; Thompson, F.E.; Bingley, C.; Willis, G.; Islam, N.G.; Baranowski, T.; McNutt, S.; et al. The automated self-administered 24-hour dietary recall (ASA24): A resource for researchers, clinicians, and educators from the National Cancer Institute. J. Acad. Nutr. Diet. 2012, 112, 1134–1137.
11. Campbell, V.A.; Dodds, M.L. Collecting Dietary Information from Groups of Older People. Limitations of the 24-Hr. Recall. J. Am. Diet. Assoc. 1967, 51, 29–33.
12. Heady, J.A. Diets of Bank Clerks Development of a Method of Classifying the Diets of Individuals for Use in Epidemiological Studies. J. R. Stat. Soc. Ser. A 1961, 124, 336.
13. Dwyer, J.T.; Gardner, J.; Halvorsen, K.; Krall, E.A.; Cohen, A.; Valadian, I. Memory of food intake in the distant past. Am. J. Epidemiol. 1989, 130, 1033–1046.
14. Chambers, E., IV; Godwin, S.L.; Vecchio, F.A. Cognitive strategies for reporting portion sizes using dietary recall procedures. J. Am. Diet. Assoc. 2000, 100, 891–897.
15. Guthrie, H.A. Selection and quantification of typical food portions by young adults. J. Am. Diet. Assoc. 1984, 84, 1440–1444.
16. Bolland, J.E.; Yuhas, J.A.; Bolland, T.W. Estimation of food portion sizes: Effectiveness of training. J. Am. Diet. Assoc. 1988, 88, 817–821.
17. Howat, P.M.; Mohan, R.; Champagne, C.; Monlezun, C.; Wozniak, P.; Bray, G. Validity and reliability of reported dietary intake data. J. Am. Diet. Assoc. 1994, 94, 169–173.
18. Gibson, R.S. Principles of Nutritional Assessment; Oxford University Press: Oxford, UK, 2005.
19. Kirkpatrick, S.I.; Dodd, K.W.; Tooze, J.; Bailey, R.L.; Freedman, L.; Midthune, D. Measurement Error Webinar Series; Risk Factor Monitoring and Methods, National Cancer Institute, National Institutes of Health: Bethesda, MD, USA, 2012.
20. Robertson, S.E.; Walker, S.; Jones, S.; Hancock-Beaulieu, M.M.; Gatford, M. Okapi at TREC-3. In Proceedings of the Third Text REtrieval Conference (TREC 1994), Gaithersburg, MD, USA, 2–4 November 1994; Available online: https://trec.nist.gov/pubs/trec3/t3_proceedings.html (accessed on 1 January 2021).
21. Jones, K.S.; Walker, S.; Robertson, S. A probabilistic model of information retrieval: Development and comparative experiments: Part 2. Inf. Process. Manag. 2000, 36, 809–840.
22. Jones, K.S.; Walker, S.; Robertson, S. A probabilistic model of information retrieval: Development and comparative experiments: Part 1. Inf. Process. Manag. 2000, 36, 779–808.
23. Jones, K.S. Index term weighting. Inf. Storage Retr. 1973, 9, 619–633.
24. Levenshtein, V.I. Binary Codes Capable of Correcting Deletions, Insertions and Reversals. Sov. Phys. Dokl. 1966, 10, 707.
25. Levandowsky, M.; Winter, D. Distance between Sets. Nature 1971, 234, 34–35.
26. Chatopera. Synonyms. Available online: https://github.com/chatopera/Synonyms#references (accessed on 20 January 2021).
27. Tucker, K.L.; Chen, H.; Hannan, M.T.; Cupples, L.A.; Wilson, P.W.F.; Felson, D.; Kiel, D. Bone mineral density and dietary patterns in older adults: The Framingham Osteoporosis Study. Am. J. Clin. Nutr. 2002, 76, 245–252.
Wirfält, A.E.; Jeffery, R.W. Using Cluster Analysis to Examine Dietary Patterns: Nutrient Intakes, Gender, and Weight Status Differ Across Food Pattern Clusters. J. Am. Diet. Assoc. 1997, 97, 272–279. [Google Scholar] [CrossRef]
Haveman-Nies, A.; Tucker, K.; de Groot, L.; Wilson, P.; van Staveren, W. Evaluation of dietary quality in relationship to nutritional and lifestyle factors in elderly people of the US Framingham Heart Study and the European SENECA study. Eur. J. Clin. Nutr. 2001, 55, 870–880. [Google Scholar] [CrossRef] [Green Version]
Walter, W. Nutritional Epidemiology; Oxford University Press: Oxford, UK, 2012; Volume 40. [Google Scholar]
Freedman, L.S.; Midthune, D.; Arab, L.; Prentice, R.L.; Subar, A.F.; Willett, W.; Neuhouser, M.L.; Tinker, L.F.; Kipnis, V. Combining a Food Frequency Questionnaire With 24-Hour Recalls to Increase the Precision of Estimation of Usual Dietary Intakes—Evidence from the Validation Studies Pooling Project. Am. J. Epidemiol. 2018, 187, 2227–2232. [Google Scholar] [CrossRef] [Green Version]
Nicastro, H.L.; Bailey, R.; Dodd, K.W. Using 2 Assessment Methods May Better Describe Dietary Supplement Intakes in the United States. J. Nutr. 2015, 145, 1630–1634. [Google Scholar] [CrossRef] [Green Version]

Figure 1. Model Structure.

Figure 2. Nutrition Survey—Dietary Information Conversion Process.

Figure 3. Model Operating Framework.

Table 1. Discrepancies between the NNHS analysis and the results of this study for sum_nutrients_24hr (1). For each nutrient, the first column gives the count and the second the percentage share in each error range.

Error Range	Energy (kcal)		Water (g)		Protein (g)		Lipid Fat (g)		Sugars Total (g)		Calcium Ca (mg)		Phosphorus P (mg)		Iron Fe (mg)	
	n	%	n	%	n	%	n	%	n	%	n	%	n	%	n	%
<1%	2187	84.1%	2503	96.2%	2284	87.8%	1902	73.1%	352	13.5%	1168	44.9%	2087	80.2%	1401	53.8%
≥1% and <2%	297	11.4%	68	2.6%	228	8.8%	319	12.3%	436	16.8%	601	23.1%	306	11.8%	734	28.2%
≥2% and <3%	45	1.7%	24	0.9%	43	1.7%	136	5.2%	270	10.4%	298	11.5%	82	3.2%	199	7.6%
≥3% and <4%	22	0.8%	5	0.2%	12	0.5%	84	3.2%	186	7.1%	194	7.5%	44	1.7%	116	4.5%
≥4% and <5%	9	0.3%	1	0.0%	10	0.4%	42	1.6%	142	5.5%	109	4.2%	21	0.8%	39	1.5%
≥5% and <6%	8	0.3%	1	0.0%	8	0.3%	29	1.1%	91	3.5%	68	2.6%	15	0.6%	41	1.6%
6–10%	27	1.0%	0	0.0%	12	0.5%	56	2.2%	316	12.1%	122	4.7%	32	1.2%	51	2.0%
11–15%	6	0.2%	0	0.0%	2	0.1%	13	0.5%	172	6.6%	22	0.8%	8	0.3%	7	0.3%
16–20%	1	0.0%	0	0.0%	2	0.1%	6	0.2%	136	5.2%	11	0.4%	3	0.1%	7	0.3%
21–30%	0	0.0%	0	0.0%	0	0.0%	8	0.3%	170	6.5%	2	0.1%	2	0.1%	7	0.3%
31–40%	0	0.0%	0	0.0%	1	0.0%	6	0.2%	91	3.5%	5	0.2%	2	0.1%	0	0.0%
41–60%	0	0.0%	0	0.0%	0	0.0%	1	0.0%	121	4.7%	0	0.0%	0	0.0%	0	0.0%
61–80%	0	0.0%	0	0.0%	0	0.0%	0	0.0%	49	1.9%	2	0.1%	0	0.0%	0	0.0%
81–100%	0	0.0%	0	0.0%	0	0.0%	0	0.0%	32	1.2%	0	0.0%	0	0.0%	0	0.0%
>100%	0	0.0%	0	0.0%	0	0.0%	0	0.0%	0	0.0%	0	0.0%	0	0.0%	0	0.0%
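
The binning behind a table of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the error is the absolute relative difference between the model value and the NNHS value, and it assumes how non-integer errors that fall between the printed ranges (for example 10.5%) are assigned, since neither is stated here. The names `percent_error`, `error_bin`, and `tabulate` are hypothetical, and the labels are ASCII renderings of the Table 1 row labels.

```python
from collections import Counter

# Assumed edges: the first six rows are half-open [lower, upper),
# matching the ">=a% and <b%" labels in Table 1.
HALF_OPEN = [(1, "<1%"), (2, ">=1% and <2%"), (3, ">=2% and <3%"),
             (4, ">=3% and <4%"), (5, ">=4% and <5%"), (6, ">=5% and <6%")]
# Assumed edges: the remaining rows include their upper bound,
# so an error of 10.5% is counted under "11-15%".
UPPER_INCLUSIVE = [(10, "6-10%"), (15, "11-15%"), (20, "16-20%"),
                   (30, "21-30%"), (40, "31-40%"), (60, "41-60%"),
                   (80, "61-80%"), (100, "81-100%")]


def percent_error(model_value, reference_value):
    """Absolute relative error in percent (assumed definition)."""
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return abs(model_value - reference_value) / abs(reference_value) * 100.0


def error_bin(error_pct):
    """Map a percent error to one of the Table 1 row labels."""
    for upper, label in HALF_OPEN:
        if error_pct < upper:
            return label
    for upper, label in UPPER_INCLUSIVE:
        if error_pct <= upper:
            return label
    return ">100%"


def tabulate(pairs):
    """Return {label: (count, percentage)} for (model, reference) pairs."""
    counts = Counter(error_bin(percent_error(m, r)) for m, r in pairs)
    total = sum(counts.values())
    return {label: (n, round(100.0 * n / total, 1))
            for label, n in counts.items()}


# Hypothetical energy values (kcal), invented for illustration only.
example_pairs = [(2005.0, 2000.0), (1540.0, 1500.0),
                 (1800.0, 1800.0), (2250.0, 2000.0)]
example_table = tabulate(example_pairs)
```

Read this way, the Energy column of Table 1 places 2187 + 297 + 45 + 22 + 9 = 2560 of its 2602 entries below a 5% error, or about 98.4%.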