
Solutions to Problem Set 4: Doctor Visits

Cameron and Trivedi (2009) have some interesting data on the number of office-based doctor visits by adults aged 25-64 based on the 2002 Medical Expenditure Panel Survey. We will use data for the most recent wave, available at https://grodri.github.io/datasets/docvis.dta .


[1] A Poisson Model

(a) Fit a Poisson regression model with the number of doctor visits (docvis) as the outcome. We will use the same predictors as Cameron and Trivedi, namely health insurance status (private), health status (chronic), gender (female) and income (income), but will add two indicators of ethnicity (black and hispanic). There are many more variables one could add, but we’ll keep things simple.

(b) Interpret the coefficient of black and test its significance using a Wald test and a likelihood ratio test.

Blacks report 17% fewer visits to the doctor than whites with the same insurance, health status, gender and income. The z-statistic of -5.12 in the output is equivalent to a χ² of 26.18 on one d.f. and is highly significant. The likelihood ratio test, obtained by fitting the model without black, gives a similar χ² of 27.67 on one d.f.
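As a quick check on this arithmetic, the Python sketch below (for illustration only; the solutions themselves rely on R and Stata) converts the Poisson coefficient of black, -0.187 as quoted in part 3.a, into a percent change and squares the z-statistic. Because both inputs are rounded, the squared z comes out near 26.2 rather than the 26.18 computed from unrounded output.

```python
import math

def percent_change(coef):
    """Percent change in the expected count for a one-unit change in a predictor."""
    return 100.0 * (math.exp(coef) - 1.0)

def wald_chi2(z):
    """A squared z-statistic is a Wald chi-squared on one d.f."""
    return z * z

b_black = -0.187  # Poisson coefficient of black, as quoted in part 3.a (rounded)
z_black = -5.12   # z-statistic quoted in the text (rounded)

print(f"percent change in visits: {percent_change(b_black):.1f}%")
print(f"Wald chi-squared: {wald_chi2(z_black):.2f}")
```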

(c) Compute a 95% confidence interval for the effect of private insurance and interpret this result in terms of doctor visits.

We can obtain a 95% confidence interval by exponentiating the bounds reported in the output. In Stata you can use the eform option. We find that respondents with private insurance visit the doctor between 1.94 and 2.17 times as often as respondents without insurance who have the same observed characteristics, namely gender, ethnicity, health status and income.
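The exponentiation step can be sketched in Python as follows. The coefficient and standard error used here are not copied from the regression output; they are back-calculated from the reported interval (1.94 to 2.17) purely to illustrate the calculation, and the function name is hypothetical.

```python
import math

def rate_ratio_ci(coef, se, z=1.96):
    """Exponentiate coef -/+ z*se to obtain a CI on the rate-ratio scale."""
    return math.exp(coef - z * se), math.exp(coef + z * se)

# Illustrative inputs recovered from the reported interval, NOT from the output.
lo_log, hi_log = math.log(1.94), math.log(2.17)
coef = (lo_log + hi_log) / 2         # midpoint of the interval on the log scale
se = (hi_log - lo_log) / (2 * 1.96)  # half-width on the log scale divided by 1.96

lower, upper = rate_ratio_ci(coef, se)
print(f"rate ratio between {lower:.2f} and {upper:.2f}")
```

The interval is symmetric around the coefficient on the log scale but not around the rate ratio itself, which is why the bounds are exponentiated rather than the standard error.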

(d) Compute the deviance and Pearson chi-squared statistics for this model. Does the model fit the data? Is there evidence of overdispersion?

The deviance of 27,870 on 4,405 d.f. is highly significant and the Pearson χ² of 55,945 is even worse. The model clearly does not fit the data. There is overwhelming evidence of overdispersion. In terms of Pearson’s χ², the variance is about 13 times the mean.
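The ratio of each statistic to its degrees of freedom gives a rough measure of overdispersion. A minimal Python check using only the figures quoted above:

```python
deviance = 27870.0      # deviance reported above
pearson_chi2 = 55945.0  # Pearson chi-squared reported above
df = 4405               # residual degrees of freedom

def dispersion(stat, df):
    """Statistic per degree of freedom; close to 1 when a Poisson model fits."""
    return stat / df

print(f"deviance / d.f.: {dispersion(deviance, df):.2f}")
print(f"Pearson / d.f.:  {dispersion(pearson_chi2, df):.2f}")
```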

(e) Predict the proportion expected to have exactly zero doctor visits and compare with the observed proportion. You will find the formula for Poisson probabilities in the notes. The probability of zero is simply e^(−μ).

The Poisson model substantially underestimates the probability of zero doctor visits, predicting 11.3% when in fact 36.4% of respondents report zero visits.
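The predicted proportion is the average of e^(−μ) over respondents, each with their own fitted mean. The Python sketch below uses a few hypothetical fitted means (not taken from the docvis data) to show the calculation, and also shows that plugging in the average mean is not equivalent.

```python
import math

def poisson_zero_share(fitted_means):
    """Average of exp(-mu_i) over respondents: the predicted proportion of zeros."""
    return sum(math.exp(-mu) for mu in fitted_means) / len(fitted_means)

# Hypothetical fitted means, for illustration only.
mu_hat = [1.0, 2.0, 3.0]

share = poisson_zero_share(mu_hat)
naive = math.exp(-sum(mu_hat) / len(mu_hat))  # exp(-average mean), a different quantity

print(f"average of exp(-mu): {share:.4f}")
print(f"exp(-average mu):    {naive:.4f}")
```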

[2] Poisson Overdispersion

(a) Suppose the variance is proportional to the mean rather than equal to the mean. Estimate the proportionality parameter using Pearson’s chi-squared and use this estimate to correct the standard errors.

We know from the previous result that the proportionality factor is 12.7. We therefore need to inflate the standard errors by a factor of √12.7 = 3.564 or almost four. In Stata we can do this calculation using the scale(x2) option.

(b) What happens to the significance of the black coefficient once we allow for extra-Poisson variation? Could we test this coefficient using a likelihood ratio test? Explain.

Once we adjust for overdispersion the black coefficient is no longer significant, with a z-statistic of -1.44 equivalent to a χ² of just 2.07. We have no evidence that blacks differ from comparable whites in the number of doctor visits. We can’t do a likelihood ratio test because we haven’t specified a likelihood.
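The adjustment amounts to dividing the Poisson z-statistic by the square root of the proportionality factor. A short Python check using the rounded values quoted in the text (so the χ² lands near 2.06 rather than exactly on the reported 2.07):

```python
import math

phi = 12.7         # proportionality factor (Pearson chi-squared / d.f.)
z_poisson = -5.12  # z-statistic for black under the Poisson model (rounded)

scale = math.sqrt(phi)     # factor by which the standard errors are inflated
z_adj = z_poisson / scale  # the coefficient is unchanged, so z shrinks by the same factor
chi2_adj = z_adj ** 2      # equivalent chi-squared on one d.f.

print(f"scale factor: {scale:.3f}")
print(f"adjusted z: {z_adj:.2f}, chi-squared: {chi2_adj:.2f}")
```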

(c) Compare the standard errors adjusted for over-dispersion with the robust or “sandwich” estimator of the standard errors. To obtain robust standard errors we follow the procedure outlined in this log.

The adjusted and robust estimates of standard errors are very similar and both much larger than the Poisson standard errors. (In case you are interested, the ratio of the robust to Poisson standard errors in this model is between 3.2 and 4.5.)

[3] A Negative Binomial Model

(a) Fit a negative binomial regression model using the same outcome and predictors as in part 1.a. Comment on any remarkable changes in the coefficients.

We need glm.nb() in the MASS package.

The estimates are very similar except for ethnicity, where the coefficient of black reflects a much larger negative effect, going from -0.187 to -0.306. Another change of note, but of smaller magnitude, is the coefficient of insurance, which now reflects a larger effect.

(b) Interpret the coefficient of black and test its significance using a Wald test and a likelihood ratio test. Compare your results with parts 1.b and 2.b.

We estimate that blacks have 26.3% fewer visits to the doctor than comparable whites. The effect is significant, with a Wald test of z = 3.10, equivalent to a χ² of 9.61 on one d.f., and a likelihood ratio χ² of 9.07, also on one d.f.

The magnitude of the effect is larger than estimated under a Poisson model. The standard error is larger than the Poisson, but comparable to the overdispersed Poisson. On balance the effect turns out to be significant.

(c) Predict the percent of respondents with zero doctor visits according to this model and compare with part 1.e. You will find a formula for negative binomial probabilities in the addendum to the notes. The probability of zero is (β/(μ + β))^α, where α = β = 1/σ².

We predict that 36.6% of respondents will have zero doctor visits, much better than the Poisson estimate of 11.3% and remarkably close to the observed value of 36.4%.
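The zero-probability formula can be sketched in Python as below. The value σ² = 1.7 is the estimate reported in part (d); the mean of 2 visits is a hypothetical value chosen only to contrast the negative binomial and Poisson probabilities of zero at the same mean.

```python
import math

def nb_zero_prob(mu, sigma2):
    """P(Y = 0) = (beta / (mu + beta))**alpha with alpha = beta = 1 / sigma2."""
    alpha = 1.0 / sigma2
    beta = alpha
    # (beta / (mu + beta))**alpha rewritten as exp(-alpha * log(1 + mu / beta))
    return math.exp(-alpha * math.log1p(mu / beta))

mu = 2.0      # hypothetical mean, for illustration only
sigma2 = 1.7  # variance of the unobserved heterogeneity, from part (d)

print(f"negative binomial P(0): {nb_zero_prob(mu, sigma2):.3f}")
print(f"Poisson P(0):           {math.exp(-mu):.3f}")
```

As σ² approaches zero the formula collapses to the Poisson probability e^(−μ), so the extra zeros come entirely from the heterogeneity term.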

(d) Interpret the estimate of σ² in this model and test its significance, noting carefully the distribution of the criterion.

The estimate of 1.7 reflects substantial unobserved heterogeneity in doctor visits, even after we take into account the available indicators of insurance and health status, gender, income and ethnicity. The χ² statistic of 17,000 is clearly significant, exceeding by far the conservative critical value of 3.84, and hence even more significant if we treated it as a 50:50 mixture of χ²’s with 0 and 1 d.f.
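The Python sketch below recomputes the likelihood ratio statistic from the Poisson and negative binomial log-likelihoods reported in part 5, and shows how the 50:50 mixture halves the usual one-d.f. p-value. The helper names are illustrative; the one-d.f. tail probability is obtained from the standard normal via the complementary error function.

```python
import math

def chi2_1_sf(x):
    """Upper-tail probability of a chi-squared with one d.f."""
    return math.erfc(math.sqrt(x / 2.0))

def mixture_p_value(x):
    """P-value under a 50:50 mixture of chi-squared with 0 and 1 d.f. (for x > 0)."""
    return 0.5 * chi2_1_sf(x)

loglik_poisson = -18374.5  # reported in part 5
loglik_negbin = -9829.31   # reported in part 5

lr = 2.0 * (loglik_negbin - loglik_poisson)
print(f"LR chi-squared: {lr:.2f}")
print(f"conservative p-value: {chi2_1_sf(lr):.3g}")
print(f"mixture p-value:      {mixture_p_value(lr):.3g}")
```

Under the mixture, a statistic of about 2.71 already corresponds to the 5% level, which is why 3.84 is the conservative critical value.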

One way to assess the magnitude of this effect is to compute quartiles of the gamma distribution with mean 1 and variance 1.7.
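A self-contained Python sketch of this calculation follows. It is not the original computation: it evaluates the gamma distribution function through the power series for the incomplete gamma function and inverts it by bisection, using only the standard library. A gamma with shape 1/1.7 and scale 1.7 has mean 1 and variance 1.7.

```python
import math

def gamma_cdf(x, shape, scale):
    """Gamma distribution function via the series for the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

def gamma_quantile(p, shape, scale):
    """Invert the distribution function by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, shape, scale) < p:  # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, scale) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

variance = 1.7
shape, scale = 1.0 / variance, variance  # mean = shape * scale = 1
quartiles = [gamma_quantile(p, shape, scale) for p in (0.25, 0.50, 0.75)]
print([round(q, 2) for q in quartiles])
```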

In terms of unobserved characteristics we see that respondents at the first quartile had 86% fewer visits than expected, those at the median had 48% fewer than expected, and those at the third quartile had 35% more than expected. The fact that the median is so far below the mean indicates a very long right tail, as shown in the next figure.

[Figure: Gamma Density with Variance 1.7]

(e) Use predicted values from this model to divide the sample into twenty groups of about equal size, compute the mean and variance of docvis in each group, and plot these values. Superimpose curves representing the over-dispersed Poisson and negative binomial variance functions and comment.

[Figure: Poisson and Negative Binomial Variance Functions]

The situation at the high end is not clear at all, as one of the groups happens to have a much larger variance than its neighbors. The quadratic function comes closer to this point at the expense of a poorer fit through most of the range. On balance it seems to provide a better compromise at the high end, so I would say that the negative binomial is marginally better than the overdispersed Poisson.
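The two variance functions being compared can be written out in a few lines of Python, using the estimates reported earlier (a proportionality factor of 12.7 and σ² = 1.7). The crossover mean computed here is simply where the two curves intersect; it is a derived illustration, not a figure from the solutions.

```python
def var_overdispersed_poisson(mu, phi=12.7):
    """Variance proportional to the mean."""
    return phi * mu

def var_negbin(mu, sigma2=1.7):
    """Variance quadratic in the mean."""
    return mu * (1.0 + sigma2 * mu)

# The curves cross where 1 + sigma2 * mu = phi.
crossover = (12.7 - 1.0) / 1.7
print(f"curves cross at a mean of about {crossover:.2f} visits")

for mu in (1.0, 5.0, 10.0):
    print(mu, var_overdispersed_poisson(mu), round(var_negbin(mu), 1))
```

Below the crossover the negative binomial implies the smaller variance and above it the larger one, which is consistent with the quadratic curve reaching toward the high-variance group at the upper end.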

[4] A Zero-Inflated Poisson Model

(a) Try a zero-inflated Poisson model with the same predictors of part 1.a in both the Poisson and inflate equations.

We need the zeroinfl() function in the pscl package.

(b) Predict the proportion of respondents with zero doctor visits according to this model and compare with 1.e and 3.c. (Don’t forget that there are two ways of having an outcome of zero in this model.)

We predict that 36.4% of respondents would have no doctor visits, which, not surprisingly, is almost exactly the observed proportion.
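The two routes to a zero can be made explicit with a small Python sketch. Here pi stands for the probability of belonging to the "always zero" class and mu for the Poisson mean of everyone else; the numerical values are hypothetical and serve only to illustrate the formulas.

```python
import math

def zip_zero_prob(pi, mu):
    """P(Y = 0): an always-zero respondent, or a Poisson respondent who happens to report zero."""
    return pi + (1.0 - pi) * math.exp(-mu)

def zip_mean(pi, mu):
    """Expected count combining the two equations."""
    return (1.0 - pi) * mu

pi, mu = 0.3, 2.0  # hypothetical values, not estimates from the model
print(f"P(zero visits):  {zip_zero_prob(pi, mu):.4f}")
print(f"expected visits: {zip_mean(pi, mu):.2f}")
```

In practice both quantities are computed for each respondent from the fitted inflate and count equations and then averaged over the sample.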

(c) Interpret the coefficients of black in the two equations. Is the effect related to whether blacks visit the doctor at all? To how often they visit?

The “always zero” class is often hard to interpret. In this case it could represent lack of access to health care, but it could also represent people who are in pretty good health and don’t need to see a doctor. I’ll couch the interpretation in terms of access to health care, which seems more credible, but recognize that this choice is debatable.

Blacks are slightly more likely than comparable whites to lack access to a doctor, but the difference is not significant. Among those with access to health care, blacks have 17% fewer visits than comparable whites. Clearly most of the difference comes from the number of visits.

Notes: one way to elaborate on this issue would be to predict zero visits setting everyone to white and then to black, but this was not asked. I get probabilities of 34.1% for white and 36.3% for blacks. One can also predict the expected number of visits by combining the two equations: I get 4.2 for whites and 3.4 for blacks. Latinos have even fewer adjusted visits, 3.1 on average, resulting both from less access to health care and fewer visits from those with access.

[5] Model Selection

Considering the results obtained so far and bearing in mind parsimony and goodness of fit, which of the models used here provides the best description of the data? Make sure you provide a clear justification of your choice.

It turns out this is fairly simple because the log-likelihoods are -18,374.5 for Poisson, -15,956.73 for the zero-inflated model, which by the way uses twice as many parameters, and -9,829.31 for the negative binomial model, which uses just one more parameter than the Poisson. The clear winner is the negative binomial.

Moreover, the model does a pretty good job reproducing the excess zeroes without the need for a separate equation, which also makes it a winner in terms of interpretation; I find the idea of unobserved heterogeneity or frailty much easier to interpret than a latent class that never sees a doctor.
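One way to put parsimony and fit on a common scale is Akaike's information criterion, which is an addition here and not part of the original solution. The Python sketch below uses the reported log-likelihoods and assumes 7 parameters for the Poisson model (a constant plus the six predictors), 14 for the zero-inflated model and 8 for the negative binomial, as implied by the description above.

```python
# Reported log-likelihoods with parameter counts inferred from the text (an assumption).
models = {
    "Poisson": (-18374.5, 7),
    "zero-inflated Poisson": (-15956.73, 14),
    "negative binomial": (-9829.31, 8),
}

def aic(loglik, n_params):
    """Akaike's information criterion: smaller is better."""
    return 2.0 * n_params - 2.0 * loglik

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.2f}")

best = min(scores, key=scores.get)
```

The differences in log-likelihood are so large that the penalty for extra parameters makes no practical difference to the ranking.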

  • Data Descriptor
  • Open access
  • Published: 16 June 2022

A dataset of simulated patient-physician medical interviews with a focus on respiratory cases

Faiha Fareez, Tishya Parikh, Christopher Wavell (ORCID: orcid.org/0000-0003-1571-8202), Saba Shahab, Meghan Chevalier, Scott Good, Isabella De Blasi, Rafik Rhouma, Christopher McMahon, Jean-Paul Lam, Thomas Lo & Christopher W. Smith

Scientific Data volume 9, Article number: 313 (2022)


  • Health care
  • Medical research

This article has been updated

Artificial Intelligence (AI) is playing a major role in medical education, diagnosis, and outbreak detection through Natural Language Processing (NLP), machine learning models and deep learning tools. However, in order to train AI to facilitate these medical fields, well-documented and accurate medical conversations are needed. The dataset presented covers a series of medical conversations in the format of Objective Structured Clinical Examinations (OSCE), with a focus on respiratory cases in audio format and corresponding text documents. These cases were simulated, recorded, transcribed, and manually corrected with the underlying aim of providing a comprehensive set of medical conversation data to the academic and industry community. Potential applications include speech recognition detection for speech-to-text errors, training NLP models to extract symptoms, detecting diseases, or for educational purposes, including training an avatar to converse with healthcare professional students as a standardized patient during clinical examinations. The application opportunities for the presented dataset are vast, given that this calibre of data is difficult to access and costly to develop.


Background & Summary

Artificial Intelligence (AI), including Natural Language Processing (NLP), Machine Learning (ML) models and deep learning tools, is playing an increasingly important role in areas of medicine such as education, diagnosis and disease classification. However, in order to train NLP models, robust and accurately documented medical conversations are needed. Medical conversation data of the kind presented here is challenging to obtain, especially in the format of audio files with corresponding processed and transcribed text documents. This dataset can be utilized to benefit the greater community, including academia and the medical industry.

A team of resident doctors in internal medicine, physiatry, anatomical pathology and family medicine, and senior Canadian medical students created this dataset. The medical interviews were recorded in the format of Objective Structured Clinical Examinations (OSCE) [1]. In total, 272 cases were simulated between the physician and the patient. These cases were recorded and classified into the categories of respiratory, musculoskeletal, cardiac, dermatological, and gastrointestinal diseases. However, the majority of simulations were respiratory cases. Please see Fig. 1 for a visual representation of the types of cases included. These audio recordings were then transcribed, manually corrected for speech-to-text errors, and labelled with an identifier specifying the speaker.

Fig. 1: Pie chart demonstrating the proportion of cases in the following categories: respiratory (78.7%, blue), musculoskeletal (16.9%, orange), gastrointestinal (2.2%, grey), cardiac (1.8%, red) and dermatological (0.4%, green).

Each component of the presented dataset can be used for various purposes. The audio recordings can be used to test the accuracy of transcription tools, and to detect speech-to-text errors. The manually corrected transcripts can be annotated with desired tags to build Named-Entity Recognition (NER) tools in order to train various NLP models. For example, the data can be used to train an NLP model to use avatars instead of the traditional standardized patient to converse with medical students for OSCEs. This has been explored by a study that investigated obtaining word embeddings from an NLP model trained on medical documents and a convolutional neural network (CNN) trained on Question-Answer (QA) systems [2]. However, their models only resulted in an accuracy of 81% in answer selection [2]. The presented dataset may help to increase the accuracy of such an educational model due to the nature of OSCE-simulated medical conversations, the rationale for chosen cases, and manual correction of speech-to-text errors and speaker identification.

A brief literature search demonstrated that Speech Recognition (SR) software studies in the past had shown error rates ranging from 7.4 to 65% [3, 4]. However, SR is still necessary to reduce turnaround times and enable cost-effective reporting of patient-physician interviews [5, 6].

One study stated that recordings made in a controlled environment with speakers simulating a medical conversation while sitting directly in front of a microphone are best for high-quality audio [7]. However, even in these ideal conditions, using conversational speech to train SR software leads to errors due to speech that is not well-formed, disfluencies like false starts, extraneous information, pauses, repetitions, and interruptions [8]. It was also found that SR software trained with medical dictations leads to higher error rates compared to those trained with medical conversations because of the lack of punctuation and grammatical differences in spoken and written language [9, 10]. In addition, the transcript produced lacks clear structure because of the natural flow of the conversation [11], so the transition from one speaker to the next may not be clear [12, 13]. To help improve the accuracy of NLP models, the presented dataset countered these issues by producing high-quality audio, minimizing disfluencies, simulating medical conversations through the tried and tested OSCEs and identifying each speaker in the transcripts.

Lastly and most importantly, getting access to medical conversations is a major roadblock for many studies because of the confidential nature of the data [14, 15], government regulations limiting the sharing of data in research, and the issue of data being monetized [16]. Research has been done using large volumes of medical conversations [17, 18], but these datasets are private and not shared, since they are costly to develop and confer industrial and research advantages [19]. One of the few publicly available large-scale medical dialogue datasets is MedDialog, which contains both a Chinese dataset with 3.4 million conversations and an English dataset with 0.26 million conversations covering 96 specialties [20]. The purpose of this dataset was to create medical dialogue systems to assist in telemedicine/online medical forums [20]. While this dataset is large and open to the public, the data is in text format only, does not have a structured approach such as the OSCE, and only some conversations conclude with a diagnosis, which may have implications for training NLP models for the purposes discussed previously [20]. Additionally, these transcripts are predominantly from online medical forums and do not accurately represent live conversations. The Bristol Archive Project also created a dataset of 327 video-recorded primary care consultations and coded transcripts, known as the “One in a million primary care consultations archive”, for future research and teaching purposes [16]. This data can be accessed by researchers with ethics approval to develop medical and research training [16]. This dataset is similar to the presented dataset in terms of methodology and content and can therefore likely be used in combination with it to increase the accuracy of NLP models [16]. However, it was created exclusively from the patient population of West England, which has implications for generalizability [16].

In summary, robust and accurate medical conversations are of utmost value, and the presented dataset can be a valuable asset to many in academia and the industry.

Methods

The methodology of developing this dataset can be broken down into the following components:

(a) Recording of Simulated Medical Conversations

(b) Cleaning of Audio

(c) Manual Correction of Transcripts

(d) Quality Control

A team of resident doctors in internal medicine, physiatry, anatomical pathology and family medicine, and senior Canadian medical students recorded simulated medical conversations in the format of Objective Structured Clinical Examinations (OSCE) on Microsoft Teams. Unlike traditional clinical exams, the OSCE is a practical and objective approach in the diagnosis and communication of medical conditions, and has the ability to handle unpredictable patient behaviour and seemingly unrelated symptoms [21]. It is often used as a standardized method to test students’ clinical skills.

Cases were divided into the following categories:

  • Respiratory cases (designated “RES”)
  • Musculoskeletal cases (designated “MSK”)
  • Cardiac cases (designated “CAR”)
  • Dermatological cases (designated “DER”)
  • Gastrointestinal cases (designated “GAS”)

272 cases were simulated and recorded (please refer to Fig. 1). The focus of the dataset was respiratory cases (214 cases). In addition, 46 musculoskeletal cases, 5 cardiac cases, 6 gastrointestinal cases and 1 dermatology case were also simulated. Of the total simulated recordings, 57% of the cases involved a male physician and 43% involved a female physician. From the patient perspective, 55% of the simulated cases involved a male patient and 45% involved a female patient. The average duration of each conversation was 11 minutes and 56 seconds. For further details, please refer to Fig. 2 for a histogram of the number of cases corresponding to various lengths of time. The focus was on respiratory cases because most pandemics, including the COVID19 pandemic, are caused by droplet or airborne based respiratory diseases. Therefore, it is crucial to differentiate a benign cause of malaise such as the common cold from a highly infectious and fatal cause such as COVID19 or Tuberculosis.
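The case counts given above reproduce the percentages shown in Fig. 1, as the following short Python check illustrates (the three-letter keys are the category designations used in the dataset):

```python
# Number of simulated cases per category, as stated in the text.
counts = {"RES": 214, "MSK": 46, "GAS": 6, "CAR": 5, "DER": 1}

total = sum(counts.values())
shares = {code: round(100.0 * n / total, 1) for code, n in counts.items()}

print(f"total cases: {total}")
for code, pct in shares.items():
    print(f"{code}: {pct}%")
```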

Fig. 2: Histograms displaying the number of conversations with their corresponding length of time in minutes (left) and number of words per conversation (right).

In deciding which medical conditions to simulate, two considerations were taken into account: the prevalence of the condition, and the mortality rate of the condition if left untreated. For example, in simulating respiratory conditions, a common infectious condition is the common cold, most often caused by rhinovirus [22], whereas a fatal condition if left untreated is a pulmonary embolism [23]. The rationale for these considerations was that physicians are taught to recognize and treat common conditions and to not miss fatal conditions. However, some conditions that are not common or highly fatal were also included within the dataset to represent the diversity of cases seen in the clinic and hospital setting. In addition, COVID19 cases were included to reflect the landscape of current burden of disease in medicine.

Each case was simulated between the acting physician and the acting patient, both being senior medical students or resident doctors. The patient chose a case using the two considerations discussed previously to guide his/her decision, and answered questions posed by the physician. Medical students and resident doctors are not typically assessed on their competence at being a standardized patient. However, they have observed many trained standardized patients during assessed OSCEs and have a good perception of how patients respond in hospital/clinical settings. They were prompted to answer questions posed by the physician as patients would respond in a clinical/hospital setting, i.e., with vague responses to open-ended questions and specific responses to direct questions. In addition, they were given the liberty of choosing the age and gender that they wanted to portray, keeping in mind the demographic population that would normally present with the condition that they had chosen to portray.

The acting physician was told to take a history as they normally would in the hospital or clinic setting to help inform a differential diagnosis. While it was acknowledged that senior medical students and resident doctors will have slightly different competency levels, they were told to ask baseline questions including symptoms experienced, time of onset, location, severity, quality, associated symptoms, review of systems, past medical history, medication, family history, sexual history and social history including travel, sick contacts, employment, housing, alcohol consumption and recreational drug use. The physician was blinded to the final diagnosis to simulate the clinic and hospital setting and to avoid asking leading questions. Each case was concluded by the physician using information gathered on history taking in order to formulate a differential diagnosis and management plan. It is important to note that although these medical conversations were recorded in the format of OSCEs, the pressures of assessment and evaluation were not a component of these conversations.

The recorded medical conversations were uploaded to Audacity 3.0.2 (www.audacityteam.org), an open-source audio editing platform, to trim extraneous information, including patient/physician identifiers and any part of the conversation that was not organic. For example, case presentations in which the physician summarized patient age, gender and history of presenting complaints, during which he/she was not directly speaking to the patient, were trimmed out.

The recorded medical conversations were uploaded to the “Microsoft Stream” platform for transcription. These transcripts were then manually corrected for speech-to-text errors, including spelling mistakes, grammar mistakes, and incorrect punctuation. For example, a common error picked up in respiratory cases was the term “cough” which was often transcribed as “cost”. Key pieces of information were also added if missed during the speech-to-text transcription phase. For example, the speech-to-text software blacked out the term “sexual” when the physician inquired about sexual health and sexually transmitted infections. Therefore, this was added back to the transcript for completeness. In addition, the text file was manually reviewed to separate physician lines indicated by “D” for doctor and patient lines indicated by “P” in order to delineate the transition between speakers. Live editing occurred while simultaneously listening to the audio files to minimize errors. Table  1 demonstrates an example of part of a transcribed audio recording that was manually corrected.
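A reader working with the transcripts might split them into speaker turns along the following lines. This Python sketch assumes each turn starts with the label “D” or “P” followed by a colon; that layout, the function name and the sample text are assumptions for illustration, and the actual files should be inspected before relying on it.

```python
def parse_turns(text):
    """Split a transcript into (speaker, utterance) pairs.

    Assumes each turn starts with 'D:' (doctor) or 'P:' (patient); lines without
    a label are treated as continuations of the previous turn.
    """
    turns = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label, sep, utterance = line.partition(":")
        if sep and label in ("D", "P"):
            turns.append((label, utterance.strip()))
        elif turns:
            speaker, previous = turns[-1]
            turns[-1] = (speaker, previous + " " + line)
    return turns

# Hypothetical sample, not taken from the dataset.
sample = "D: What brings you in today?\n\nP: I've had a cough\nfor three days.\n"
for speaker, utterance in parse_turns(sample):
    print(speaker, "|", utterance)
```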

Once the audio was cleaned and transcripts manually corrected by the initial reviewer, a team of two people reviewed the audio files and transcripts in order to ensure that the mistakes discussed in parts b and c were not present. This was performed by simultaneously listening to the corresponding audio file while editing the transcript. The American version of English was used for the transcripts.

Data Records

The simulated medical conversation dataset is available on figshare.com [24]. The dataset is divided into two sets of files: audio files of the simulated conversations in mp3 format, and the transcripts of the audio files as text files. There are 272 mp3 audio files and 272 corresponding transcript text files. Each file is titled with three characters and four digits. RES stands for respiratory, GAS represents gastrointestinal, CAR is cardiovascular, MSK is musculoskeletal, DER is dermatological, and the four following digits represent the case number of the respective disease category.
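The naming scheme lends itself to a simple parser. The Python sketch below assumes file names such as “RES0001.mp3”, that is, the three-letter code directly followed by four digits and a file extension; the exact names and extensions in the repository are an assumption and should be checked against the files themselves.

```python
import re

CATEGORIES = {
    "RES": "respiratory",
    "GAS": "gastrointestinal",
    "CAR": "cardiovascular",
    "MSK": "musculoskeletal",
    "DER": "dermatological",
}

CASE_ID = re.compile(r"^(RES|GAS|CAR|MSK|DER)(\d{4})$")

def parse_case_id(filename):
    """Return (category, case number) for a name like 'RES0001.mp3' (assumed layout)."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension if there is one
    match = CASE_ID.match(stem)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename!r}")
    return CATEGORIES[match.group(1)], int(match.group(2))

print(parse_case_id("RES0001.mp3"))  # hypothetical file name
```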

Technical Validation

Using the Objective Structured Clinical Examination (OSCE) format for medical conversations facilitated objectivity, consistency, and organization. Medical conversations between resident doctors and medical students followed an overall format of elucidating the following pertinent information: symptoms and respective qualifiers (such as time of onset, location, severity, etc.), associated symptoms, review of systems, past medical history, medications, family history, social history, and other risk factors. During the manual transcript correction phase, key pieces of information were added if missed during the speech-to-text transcription phase, and spelling errors, grammar mistakes, and other inconsistencies were corrected. Speaker transition was also denoted. After this initial processing, the audio and transcripts were reviewed again by listening to every audio file in full while manually correcting each transcript, to ensure the text accurately reflected what was said in the audio file. As discussed in Methods, the “physician” was blinded to the final diagnosis in order to simulate the clinic and hospital setting, and to avoid asking leading questions.

Usage Notes

The presented dataset can be utilized in many ways. The audio recordings can be used to test the accuracy and precision of transcription tools and speech recognition software. By extension, it can be used to detect and fix speech-to-text errors. The manually corrected transcripts can be annotated with desired tags to develop tools such as Named-Entity Recognition (NER) and train NLP models to build educational models. For example, it can be used to train an NLP model to use avatars to converse with medical students or other healthcare professional students for OSCEs by replacing the traditional standardized patient which can have cost and access implications for students and institutions. Overall, this comprehensive dataset can also be used to create an end-to-end system from symptom extraction to disease classification.
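One common way to score a transcription tool against a corrected transcript is the word error rate: the word-level edit distance divided by the number of words in the reference. The article does not prescribe a metric, so the Python sketch below is only one possible approach. The example sentences are hypothetical, built around the “cough” versus “cost” confusion mentioned in the Methods.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)

corrected = "I have a cough"  # hypothetical manually corrected line
automatic = "I have a cost"   # hypothetical speech-to-text output
print(word_error_rate(corrected, automatic))
```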

High-quality audio of medical conversations is difficult to simulate due to factors such as environment control and microphone position [7]. In addition, high-quality transcripts of medical conversations are difficult to access due to speech-to-text errors of SR software, including spelling errors, grammar mistakes, and disfluencies like false starts, extraneous information, pauses, repetitions and interruptions [8]. The transcribed file also often fails to indicate the transition between speakers [12]. In creating this dataset, special attention was given to all of these drawbacks in order to create a comprehensive dataset that is robust, accurate, easy to understand and applicable to train any NLP model. Most importantly, access to this calibre of data is a major challenge for many researchers because of the confidential nature of the data [14, 15], government regulations that limit data sharing in research, and the issue of data being monetized [16]. Therefore, the presented dataset of comprehensive medical conversations in audio and text formats is a valuable asset to academia and the medical industry.

While there are many benefits to this dataset, as mentioned above, there are limitations to using it to train NLP models. The first limitation is the small number of conversations about non-respiratory illnesses. It is also important to note that although these medical conversations were recorded in the format of OSCEs, the pressures of assessment and evaluation were not a component of these conversations. This may have implications specifically if these conversations were to be used to train an NLP model to use avatars to converse with medical students or other healthcare professionals for OSCEs. However, as discussed in the methods section, the physician was instructed to ask questions as they would in the hospital or clinic setting and prompted to cover baseline topics as previously discussed. In addition, not having the pressures of a formal evaluation may serve as a benefit in simulating medical conversations as it could allow for more realistic dialogue encountered in the clinic/hospital setting.

The patient was given the liberty to choose the age and gender that he/she wanted to portray based on the demographic population that would typically present with his/her chosen condition. As a result, in some audio files the voice of the medical student or resident doctor (all of whom were in their twenties) does not match that of the elderly patient being portrayed. This may have implications for the dataset’s use in detecting speech-to-text errors, as the voice of an elderly patient may sound different from that of a younger patient and thus may affect the quality of the speech-to-text function. However, since the audio files are also converted into corrected transcripts, this should not have any implications for training NLP models to extract symptoms, detect diseases, or for educational purposes, including training an avatar to converse with healthcare professional students as a standardized patient during clinical examinations.

In addition, although the OSCE-styled medical conversations are superior to traditional clinical exams in terms of objectivity, precision, and ability to handle unpredictable patient behavior and seemingly unrelated symptoms, they are limited in their ability to simulate real-world patient-physician conversations, which are more complex due to subtle body language, facial cues and other non-verbal presentations. Furthermore, these medical conversations only covered the history-taking part of simulated medical visits. Physical exams were not included in the medical conversation and therefore, there may be limitations in informing a clinical differential diagnosis and management plan. This dataset has 3309 minutes of audio and 272 transcribed texts. Training AI models is data-intensive, requiring large amounts of data [25, 26, 27]. Therefore, this dataset can be combined with other datasets for the purposes described previously. The user will have to take into consideration transferability and generalizability when combining such data. Lastly, this dataset focussed predominantly on respiratory cases, which limits its usage. However, as discussed previously, the team believed this topic was most relevant given the current burden of disease, particularly the COVID19 pandemic.

Code availability

Not applicable to this dataset.

Change history

26 May 2023

The link to data citation in reference 24 was incorrect in the original version ( https://figshare.com/s/d83162fad67407081b32 ) and has been corrected to https://doi.org/10.6084/m9.figshare.c.5545842.v1 . The original article has been corrected.

References

1. Harden, R. M. What is an OSCE. Medical Teacher. 10, 19–22 (1998).
2. Zini, J.E., Rizk, Y., Awad, M. & Antoun, J. Towards A Deep Learning Question-Answering Specialized Chatbot for Objective Structured Clinical Examinations. IJCNN. 1–9 (2019).
3. Zhou, L. et al. Analysis of Errors in Dictated Clinical Documents Assisted by Speech Recognition Software and Professional Transcriptionists. JAMA Netw Open. 1, e180530 (2018).
4. Kodish-Wachs, J., Agassi, E., Kenny, P. & Overhage, J. M. A systematic comparison of contemporary automatic speech recognition engines for conversational clinical speech. AMIA. 2018, 683–689 (2018).
5. Johnson, M. et al. A systematic review of speech recognition technology in health care. BMC Med Inform Decis Mak. 14, 94 (2014).
6. Tobias, H. & Enrico, C. Risks and benefits of speech recognition for clinical documentation: a systematic review. JAMIA. 23, e169–e179 (2016).
7. Quiroz, J. C. et al. Challenges of developing a digital scribe to reduce clinical documentation burden. NPJ digital medicine. 2, 114 (2019).
8. Zayats, V. & Ostendorf, M. Giving attention to the unexpected: using prosody innovations in disfluency detection. Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1, 86–95 (2019).
9. Kahn, J. G., Lease, M., Charniak, E., Johnson, M. & Ostendorf, M. Effective use of prosody in parsing conversational speech. In Proc. Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 233–240 (2005).
10. Finley, G. et al. An automated medical scribe for documenting clinical encounters. In Proc. 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. 11–15 (2018).
11. Lacson, R. C., Barzilay, R. & Long, W. J. Automatic analysis of medical dialogue in the home hemodialysis domain: structure induction and summarization. J. Biomed. Inform. 39, 541–555 (2006).
12. Wachter, R. & Goldsmith, J. To combat physician burnout and improve care, fix the electronic health record. Harvard Bus. Rev. (2018).
13. Lacson, R. & Barzilay, R. Automatic processing of spoken dialogue in the home hemodialysis domain. AMIA. 420–424 (2005).
14. Du, N. et al. Extracting symptoms and their status from clinical conversations. In Proc. of the 57th Annual Meeting of the Association of Computational Linguistics, 915–925 (2019).
15. Cios, K. J. & William, M. G. Uniqueness of medical data mining. Artif. Intell. Med. 26, 1–24 (2002).
16. Jepson, M. et al. The ‘One in a Million’ study: creating a database of UK primary care consultations. Br. J. Gen. Pr. 67, e345–e351 (2017).
17. Rajkomar, A. et al. Automatically charting symptoms from patient-physician conversations using machine learning. JAMA Intern. Med. 179, 836–838 (2019).
18. Shafey, L. E., Soltau, H. & Shafran, I. Joint speech recognition and speaker diarization via sequence transduction. In Interspeech. 396–400 (2019).
19. Liu, Z. et al. Fast prototyping a dialogue comprehension system for nurse-patient conversations on symptom monitoring. Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2, 24–31 (2019).
20. Zeng, G. et al. MedDialog: Large-scale Medical Dialogue Datasets. In EMNLP. 9241–9250 (2020).
21. Zayyan, M. Objective structured clinical examination: the assessment of choice. Oman Med J. 26, 219–222 (2011).
22. Heikkinen, T. & Järvinen, A. The common cold. The Lancet. 361, 51–59 (2003).
23. Bĕlohlávek, J., Dytrych, V. & Linhart, A. Pulmonary Embolism, Part I: Epidemiology, risk factors and risk stratification, pathophysiology, clinical presentation, diagnosis and nonthrombotic pulmonary embolism. Exp. Clin. Cardiol. 18, 129–138 (2013).
24. Fareez, F. et al. A dataset of simulated patient-physician medical interviews with a focus on respiratory cases. Figshare https://doi.org/10.6084/m9.figshare.c.5545842.v1 (2022).
25. Chartrand, G. et al. Deep learning: a primer for radiologists. Radiographics. 37, 2113–2131 (2017).
26. Hu, G., Peng, X., Yang, Y., Hospedales, T. M. & Verbeek, J. Frankenstein: Learning deep face representations using small data. IEEE Trans. Image Process. 27, 293–303 (2018).
27. Chen, D. et al. Deep learning and alternative learning strategies for retrospective real-world clinical data. Npj Digit. Med. 2, 43 (2019).

Author information

Authors and affiliations.

Western University, London, N6A 3K7, Canada

Faiha Fareez, Tishya Parikh, Christopher Wavell, Saba Shahab, Meghan Chevalier, Scott Good, Isabella De Blasi & Christopher W. Smith

Goodlabs Studio, Toronto, M5H 3E5, Canada

Faiha Fareez, Tishya Parikh, Christopher Wavell, Saba Shahab, Meghan Chevalier, Scott Good, Isabella De Blasi, Rafik Rhouma, Christopher McMahon, Jean-Paul Lam, Thomas Lo & Christopher W. Smith

Department of Economics, University of Waterloo, Waterloo, N2L 3G1, Canada

Rafik Rhouma, Christopher McMahon & Jean-Paul Lam

Polytechnique Montreal, Montreal, H3T 1J4, Canada

Rafik Rhouma

Contributions

  • Faiha Fareez: First author of the manuscript. Created and recorded medical conversations with co-residents and medical students, and manually edited transcripts.
  • Tishya Parikh: Created and recorded medical conversations with co-residents and medical students, manually edited transcripts, and edited and reviewed the transcript.
  • Christopher Wavell: Created and recorded medical conversations with co-residents and medical students, manually edited transcripts, and edited and reviewed the transcript.
  • Saba Shahab: Created and recorded medical conversations with co-residents and medical students, manually edited transcripts, and edited and reviewed the transcript.
  • Meghan Chevalier: Created and recorded medical conversations with co-residents and medical students, manually edited transcripts, and edited and reviewed the transcript.
  • Scott Good: Created and recorded medical conversations with co-residents and medical students, manually edited transcripts, and edited and reviewed the transcript.
  • Isabella De Blasi: Created and recorded medical conversations with co-residents and medical students, manually edited transcripts, and edited and reviewed the transcript.
  • Rafik Rhouma: Provided feedback and helped edit the manuscript.
  • Christopher McMahon: Provided feedback and helped edit the manuscript.
  • Jean-Paul Lam: Provided feedback and helped edit the manuscript.
  • Thomas Lo: Provided feedback and helped edit the manuscript.
  • Christopher Smith: Senior author and organizer of the project, oversaw the direction of the project/publication.

Corresponding author

Correspondence to Christopher W. Smith .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Fareez, F., Parikh, T., Wavell, C. et al. A dataset of simulated patient-physician medical interviews with a focus on respiratory cases. Sci Data 9 , 313 (2022). https://doi.org/10.1038/s41597-022-01423-1

Received : 21 September 2021

Accepted : 25 May 2022

Published : 16 June 2022

DOI : https://doi.org/10.1038/s41597-022-01423-1

12 Notable Healthcare Datasets for 2022

Posted by Elizabeth Wallace, ODSC, on January 18, 2022. Categories: Modeling, Datasets, Healthcare.

Machines continue to show us how valuable they are to our everyday lives, and healthcare is no exception. However, finding quality healthcare data to train these machines can be a challenge. Luckily, researchers, governments, and even private companies recognize the value of providing (anonymized) data to advance healthcare initiatives and the public good. Here are 12 notable healthcare datasets for 2022. 

V7 COVID-19 X-Ray Dataset

You didn’t think we’d get out of this article without talking about Covid-19, did you? The Covid-19 X-Ray dataset offers more than 6000 annotated images of lungs with other characteristics removed. For example, traditional lung x-rays often show the shoulders or ribcage, which could help identify the age of the patient. Images include patients with and without Covid-19, and could help develop better tools for assessing the disease severity in individual patients.

Big Cities Health Inventory Data Platform

Big Cities Health Coalition upgraded its platform to include comparisons of key public health indicators across 28 cities. This collection contains more than 17,000 data points, and researchers can navigate through desired focus points with the navigation menu. Users gain greater insight into what’s impacting the U.S.’s big cities and can train machines accordingly.

Health Data

Governments are beginning to recognize the value of making datasets available to encourage innovation. This site offers high-value health data, including recent datasets for Covid-19, collected from the U.S. Department of Health and Human Services, as well as state partners. Researchers can explore 

Human Mortality Database

With information from 41 different countries, this dataset provides detailed mortality and population data. This type of data aids researchers and entrepreneurs in building solutions to improve life quality, address pressing chronic illness challenges, and manage or prevent environmental causes of shortened lifespans, among many other applications. The site now includes a new dataset—Short Term Mortality Fluctuations—for comparing responses to epidemics across 38 countries.

Open Access Series of Imaging Studies (OASIS) Brains Dataset

The latest release, OASIS-3, offers freely available datasets for researchers and citizen data scientists looking to explore advances in cognitive health, with images showcasing normal brain scans and those diagnosed with Alzheimer’s. It aims to improve clinical neuroscience initiatives and includes data across a broad demographic spectrum. Researchers can find thousands of images in the first, second, and now the third update. The datasets are free, but hopeful researchers must apply for use and sign the appropriate privacy agreements.

Child Health and Development Studies

The nonprofit Public Health Institute offers data on factors in early life. Researchers can access environmental, behavioral, genetic, and other biological data for participants. In many cases, these datasets cover decades of monitoring. The datasets offer a connection from these factors in early life to health outcomes later in adulthood. The datasets are free, but researchers must apply and sign agreements to access the data.

Data Discovery at the National Library of Medicine

The National Library of Medicine offers a variety of datasets from public health to drugs and supplements. These offer researchers data to explore in a variety of formats and over 130 different projects. Many of the datasets were updated last year. Researchers may use datasets for free but should follow the individual license agreements for each set.

National Cancer Institute SEER Data

The Surveillance, Epidemiology, and End Results Program offers population data by age, sex, race, year of diagnosis, and geographic areas. SEER releases new research data every spring and offers specialized datasets for researchers looking for something outside the available datasets. While these sets are free, researchers must apply for special access. There’s also an interactive toolbox to make the search for the right dataset easier.

Merck Molecular Health Activity Challenge

For drug discovery training sets, this dataset located on Kaggle offers datasets simulating how molecule sets interact with each other. The set also includes starter code in R for reading the datasets, and the benchmark result for several tasks is available as an example set. It offers 15 molecular datasets originally part of a Kaggle competition, and each belongs to a biologically relevant target.

Kent Ridge Biomedical Datasets

Located on the ELVIRA Biomedical Data Set Repository, this biomedical dataset collection focuses on data published in journals such as Science and Nature. It offers high-dimensional sets, including gene expression, protein profiling, and genomic sequence data. The list ranges from breast cancer sets to data on the central nervous system.

MedPix from the National Library of Medicine

This free database contains more medical images, teaching scenarios, teaching cases, and clinical topics. These attach to nearly 59,000 images by disease location, pathology category, and patient profiles. These images are indexed and curated, coming from over 12,000 patients. They are continually accepting new data submissions, and the images could offer valuable training options for computer vision, diagnostics, or other tools.

National Institute of Health X-Ray Dataset

This 100,000-plus strong image dataset lives on Kaggle and focuses specifically on chest x-rays. It includes over 30,000 unique patients and disease labels generated from NLP text-mining. These have an expected 90% accuracy rate. Researchers don’t have access to the original radiology reports, but interested parties can read the paper outlining the labeling process. Kaggle encourages other parties to offer other notations to update or correct erroneous labels.

Training New Healthcare Solutions

These data sets offer new choices for your healthcare solutions, whether you need data or images. As A.I. becomes a more significant part of healthcare solutions from beginning to end, expanding data set choices can provide the training. Be sure to check out the datasets from 2020 to find even more options for quality healthcare data.

Elizabeth Wallace, ODSC

Elizabeth is a Nashville-based freelance writer with a soft spot for startups. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do. Connect with her on LinkedIn here: https://www.linkedin.com/in/elizabethawallace/

This project aims to analyze the "Doctor Visit Analysis" dataset to uncover insights into patient behavior and healthcare trends. The dataset encompasses details from diverse doctor visits, such as patient gender, illness specifics, age, income, and private or non-private sector affiliation.

YSivaSaiSree/Doctor-Visit-Analytics

  • Jupyter Notebook 100.0%

Australian Health Service Utilization Data

Description

Cross-section data originating from the 1977–1978 Australian Health Survey.

A data frame containing 5,190 observations on 12 variables.

Number of doctor visits in past 2 weeks.

Factor indicating gender.

Age in years divided by 100.

Annual income in tens of thousands of dollars.

Number of illnesses in past 2 weeks.

Number of days of reduced activity in past 2 weeks due to illness or injury.

General health questionnaire score using Goldberg's method.

Factor. Does the individual have private health insurance?

Factor. Does the individual have free government health insurance due to low income?

Factor. Does the individual have free government health insurance due to old age, disability or veteran status?

Factor. Is there a chronic condition not limiting activity?

Factor. Is there a chronic condition limiting activity?
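
Two of the variables described above are stored on rescaled units: age as years divided by 100, and income in tens of thousands of dollars. The following is a minimal Python sketch of those conversions. The class, field, and function names are hypothetical (the variable names themselves do not appear in the text above), and the example values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class VisitRecord:
    """One survey respondent, using the rescaled units described above.

    Field names are hypothetical placeholders, not the dataset's column names.
    """
    visits: int           # number of doctor visits in past 2 weeks
    age_scaled: float     # age in years divided by 100
    income_scaled: float  # annual income in tens of thousands of dollars


def from_raw(visits: int, age_years: float, income_dollars: float) -> VisitRecord:
    """Convert natural units into the dataset's rescaled units."""
    return VisitRecord(
        visits=visits,
        age_scaled=age_years / 100.0,
        income_scaled=income_dollars / 10_000.0,
    )


def to_raw(record: VisitRecord) -> tuple:
    """Convert a record back to (visits, age in years, income in dollars)."""
    return (record.visits, record.age_scaled * 100.0, record.income_scaled * 10_000.0)


# Made-up example: a 35-year-old with an income of 25,000 dollars and 2 visits.
example = from_raw(visits=2, age_years=35, income_dollars=25_000)
```

Keeping the conversion in one place matters mainly when interpreting model coefficients: a one-unit change in the stored age variable corresponds to 100 years, not one year.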

Journal of Applied Econometrics Data Archive.

http://qed.econ.queensu.ca/jae/1997-v12.3/mullahy/

Cameron, A.C. and Trivedi, P.K. (1986). Econometric Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics , 1 , 29–53.

Cameron, A.C. and Trivedi, P.K. (1998). Regression Analysis of Count Data . Cambridge: Cambridge University Press.

Mullahy, J. (1997). Heterogeneity, Excess Zeros, and the Structure of Count Data Models. Journal of Applied Econometrics , 12 , 337–350.

CameronTrivedi1998

License High-quality Healthcare/Medical Data for AI & ML Models

Off-the-shelf Healthcare/Medical Datasets to jumpstart your Healthcare AI project

Medical And Healthcare Datasets

Plug-in the medical data you’ve been missing today

Physician Dictation Audio Data

Our de-identified healthcare dataset includes audio files from 31 different specialties, dictated by physicians describing patients’ clinical conditions and plans of care based on physician-patient encounters in clinical settings.

Off-the-Shelf Physician Dictation Audio Files:

  • 257,977 hours of Real-world Physician Dictation Speech Dataset from 31 specialties to train Healthcare Speech models
  • Dictation audio captured from various devices like Telephone Dictation (54.3%), Digital Recorder (24.9%), Speech Mic (5.4%), Smart Phone (2.7%) and Unknown (12.7%)
  • PII Redacted Audio & Transcripts adhering to Safe Harbor Guidelines in conformance with HIPAA

Transcribed Medical Records

Transcribed medical records are transcriptions of physician-patient conversations, medical reports, and medical assessments. They help map a patient’s medical history for future visits and act as a reference point for doctors. They also help evaluate the patient’s present condition and suggest a suitable treatment.

Off-the-Shelf Transcribed Medical Records:

  • Transcription of 257,977 hours of Real-world Physician Dictation from 31 specialties to train Healthcare Speech models
  • Transcribed Medical Records from various work types like Operative Report, Discharge Summary, Consultation Note, Admit Note, ED Note, Clinic Note, Radiology Report, etc.

Electronic Health Records (EHR)

Electronic Health Records (EHRs) are medical records that contain a patient’s medical history, diagnoses, prescriptions, treatment plans, vaccination or immunization dates, allergies, radiology images (CT scans, MRIs, X-rays), laboratory tests, and more.

Off-the-Shelf Electronic Health Records (EHR):

  • 5.1M+ Records and physician audio files in 31 specialties
  • Real-world gold-standard medical records to train Clinical NLP and other Document AI models
  • Metadata information like MRN (Anonymized), Admission Date, Discharge Date, Length of Stay days, Gender, Patient Class, Payer, Financial Class, State, Discharge Disposition, Age, DRG, DRG Description, $ Reimbursement, AMLOS, GMLOS, Risk of mortality, Severity of illness, Grouper, Hospital Zip Code, etc.
  • Medical Records from various US states and region- North East (46%), South (9%), Midwest (3%), West (28%), Others (14%)
  • Medical Records belonging to all Patient Classes covered- Inpatient, Outpatient (Clinical, Rehab, Recurring, Surgical Day Care), Emergency.

  • Medical Records belonging to all Patient Age Groups <10 yrs (7.9%), 11-20 yrs (5.7%), 21-30 yrs (10.9%), 31-40 yrs (11.7%), 41-50 yrs (10.4%), 51-60 yrs (13.8%), 61-70 yrs (16.1%), 71-80 yrs (13.3%), 81-90 yrs (7.8%), 90+ yrs (2.4%)
  • Patient Gender ratio of 46% (Male) and 54% (Female)
  • PII Redacted Documents adhering to Safe Harbor Guidelines in conformance with HIPAA
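
Before relying on a vendor's published breakdowns, it can be worth checking that each distribution is internally consistent. The sketch below takes the percentages quoted in the listings above and confirms that each breakdown sums to 100%. The dictionary keys and function names are hypothetical labels chosen for this example, not fields of any Shaip product.

```python
# Percentages copied from the listings above; keys are illustrative labels only.
reported_breakdowns = {
    "dictation_device": [54.3, 24.9, 5.4, 2.7, 12.7],
    "us_region": [46, 9, 3, 28, 14],
    "age_group": [7.9, 5.7, 10.9, 11.7, 10.4, 13.8, 16.1, 13.3, 7.8, 2.4],
    "gender": [46, 54],
}


def total_share(shares):
    """Sum a list of percentages, rounded to one decimal to absorb float noise."""
    return round(sum(shares), 1)


def check_breakdowns(breakdowns, expected=100.0, tolerance=0.1):
    """Return {name: True/False} for whether each breakdown sums to `expected`."""
    return {
        name: abs(total_share(shares) - expected) <= tolerance
        for name, shares in breakdowns.items()
    }


results = check_breakdowns(reported_breakdowns)
```

A breakdown that fails this check usually signals rounding in the published figures or a missing category rather than a problem with the underlying data.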

CT Scan Image Dataset

Doctors use CT scan images to diagnose and detect abnormal or normal conditions in a patient’s body. In computerized diagnostic image processing, a CT scan image passes through several phases: acquisition, image enhancement, extraction of important features, Region of Interest (ROI) identification, result interpretation, etc.

Shaip provides high-quality CT scan image datasets essential for research and medical diagnosis. Our datasets include thousands of high-resolution images collected from real patients and processed with state-of-the-art techniques. These datasets are designed to help medical professionals and researchers improve their knowledge and understanding of various medical conditions, including cancer, neurological disorders, and cardiovascular diseases. 

MRI Image Dataset

Computer vision models are designed to derive meaningful information from digital images and videos. It allows extensive use of healthcare image data to provide better diagnosis, treatment, and prediction of diseases. It can use context from the image sequence, texture, shape, and contour information, as well as past knowledge, to produce 3D and 4D information that aids in improved human understanding. Like CT scans, MRIs are also used to diagnose and detect abnormal or normal conditions in a patient’s body (i.e., to identify disease or injury within various body parts).

Shaip provides high-quality MRI image datasets from real patients and processed with state-of-the-art techniques.

X-Ray Image Dataset

X-ray testing is used to verify the internal structure and integrity of the object. X-ray images of a test object can be generated at different positions and different energy levels to diagnose and detect abnormal conditions in a patient’s body.

Shaip provides high-quality X-Ray image datasets essential for research and medical diagnosis. Our datasets include thousands of high-resolution images collected from real patients and processed with state-of-the-art techniques. With Shaip, you can access reliable and accurate medical data to enhance your research and improve patient outcomes.

Can’t find what you are looking for?

New off-the-shelf medical datasets are being collected across all data types.

Life Satisfaction and Frequency of Doctor Visits

Eric S. Kim

a Department of Psychology, University of Michigan, Ann Arbor, MI 48109

Nansook Park

Jennifer K. Sun

c University of Michigan Medical School, Ann Arbor, MI 48109

Jacqui Smith

b Institute for Social Research, University of Michigan, Ann Arbor, MI 48109

Christopher Peterson

Identifying positive psychological factors that reduce health care use may lead to innovative efforts that help build a more sustainable and high quality health care system. Prospective studies indicate that life satisfaction is associated with good health behaviors, enhanced health, and longer life, but little information is available about the association between life satisfaction and health care use. We tested whether higher life satisfaction was prospectively associated with fewer doctor visits. We also examined potential interactions between life satisfaction and health behaviors.

Participants were 6,379 adults from the Health and Retirement Study, a prospective and nationally representative panel study of American adults over the age of 50. Participants were tracked for four years. We analyzed the data using a generalized linear model with a gamma distribution and log link.
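
To make the model class concrete, here is a minimal Python sketch of a gamma generalized linear model with a log link, fitted to a single predictor by iteratively reweighted least squares (IRLS). This is not the authors' code: the function name is hypothetical, the data are synthetic and noiseless (generated with a built-in rate ratio of 0.89 per unit), and the actual study adjusted for many covariates.

```python
import math


def fit_gamma_log_glm(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x for a gamma GLM with log link via IRLS.

    With a gamma variance function and a log link the IRLS working weights
    are constant, so each step is ordinary least squares of the working
    response on x. Assumes x contains at least two distinct values.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)

    # Start from an intercept-only model at the log of the sample mean.
    b0, b1 = math.log(sum(y) / n), 0.0
    for _ in range(iterations):
        eta = [b0 + b1 * xi for xi in x]   # linear predictor
        mu = [math.exp(e) for e in eta]    # fitted means
        # Working response: eta + (y - mu) * d(eta)/d(mu), where d(eta)/d(mu) = 1/mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        z_mean = sum(z) / n
        sxz = sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z))
        b1 = sxz / sxx
        b0 = z_mean - b1 * x_mean
    return b0, b1


# Synthetic illustration only: six satisfaction levels and noiseless "visits"
# generated so that each extra unit multiplies the expected count by 0.89.
satisfaction = [1, 2, 3, 4, 5, 6]
visits = [math.exp(2.0 + math.log(0.89) * s) for s in satisfaction]

b0, b1 = fit_gamma_log_glm(satisfaction, visits)
rate_ratio = math.exp(b1)  # multiplicative change in expected visits per unit
```

Because the link is logarithmic, exponentiating a coefficient gives a rate ratio, which is the form in which effects of this kind of model are usually reported.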

Higher life satisfaction was associated with fewer doctor visits. On a six-point life satisfaction scale, each unit increase in life satisfaction was associated with an 11% decrease in doctor visits—after adjusting for sociodemographic factors (RR = 0.89, 95% CI = 0.86 to 0.93). The most satisfied respondents (N=1,121; 17.58%) made 44% fewer doctor visits than the least satisfied (N=182; 2.85%). The association between higher life satisfaction and reduced doctor visits remained even after adjusting for baseline health and a wide range of sociodemographic, psychosocial, and health-related covariates (RR = 0.96, 95% CI = 0.93 to 0.99).

Conclusions

Higher life satisfaction is associated with fewer doctor visits, which may have important implications for reducing health care costs.

Sustaining high quality health care is a growing concern for all and an issue prominently featured in national and international debates. In the United States, health care costs are approximately 16% of the country's gross domestic product and expected to almost double in the next decade, reaching 4.6 trillion dollars by 2020 ( 1 ). Despite spending more on health care than any other nation, people in the United States have inferior life expectancies and disease profiles, when compared against people in other developed nations ( 2 , 3 ). This disadvantage is not exclusively attributable to disadvantaged and poor Americans, because even very wealthy and educated Americans are in worse health than their counterparts in comparable countries ( 2 , 3 ).

A related issue is the increasing utilization of health care services. The aging population, which uses a continuously increasing number of doctor visits, puts a significant strain on the health care system. High use of health care services by certain segments of the population will not only lead to increased costs but also reduce the quality and timeliness of care for all. Health care use and associated costs have numerous contributors. Among these contributors, researchers have attempted to identify the factors that can be changed without sacrificing quality of care. Of interest have been administrative costs, medical technologies, prescription drugs, hospital care, and addressing physical illnesses and conditions that increase the use and costs of health care in an aging population ( 4 ). Psychological problems such as depression have also been examined ( 5 , 6 ). The common focus has been on factors associated with increased health care use, which translates into increased costs.

An alternative strategy is to identify factors associated with decreased health care use and costs. Possible candidates include positive psychological factors such as optimism, positive affect, purpose in life, and life satisfaction, each of which is strongly associated with better health ( 7 - 22 ). However, the link between such positive psychological factors and health care use is unknown. Given that the absence of psychological problems does not indicate the presence of psychological well-being ( 23 ), identifying positive psychological factors that reduce health care use may lead to innovative efforts that help build a sustainable and high quality health care system.

Using longitudinal data from the Health and Retirement Study ( 24 , 25 ), the present research examined the link between life satisfaction (sometimes called happiness) and health care use, measured by the number of doctor visits made among older US adults over a four-year period. Life satisfaction is an individual's overall judgment of how well his or her life has been lived. Life satisfaction foreshadows good health, including reduced risk of heart disease and increased longevity ( 18 - 20 ). We hypothesized that higher life satisfaction would predict fewer doctor visits, even when possible confounds such as age, gender, ethnicity, wealth, baseline number of chronic illnesses, frequency of past doctor visits, functional status, and an array of other factors were taken into account. To our knowledge, this is the first study to examine the association between life satisfaction and healthcare use.

Study Design and Sample

The Health and Retirement Study is an ongoing nationally representative panel study of US adults aged 50 and older, surveying over 22,000 Americans every two years since 1992. The HRS is sponsored by the National Institute on Aging (grant number NIA U01AG009740) and is conducted by the University of Michigan ( 24 , 25 ). Because this study used de-identified, publicly available data, the Institutional Review Board at the University of Michigan exempted it from review.

Starting in 2006, HRS began collecting psychosocial measures. That year, approximately 50% of HRS respondents were visited for an enhanced face-to-face interview. These respondents were also asked to complete a self-reported psychosocial questionnaire. The response rate was 90%. During the four-year follow-up, 789 out of 7,168 respondents passed away and were dropped from the analyses. The final sample consisted of 6,379 respondents. Sensitivity analyses were run to examine the impact of dropping deceased respondents. Whether the deceased were included or excluded, the relationship between life satisfaction and doctor visits remained significant. Therefore, data that excluded deceased respondents was used.

Life Satisfaction

In 2006, life satisfaction was assessed using the Satisfaction with Life Scale (SWLS) ( 26 ). Respondents were asked to rate each of five items on a 6-point Likert scale, indicating the degree to which they endorsed statements such as “In most ways my life is close to ideal” (1 = strongly disagree to 6 = strongly agree). All five items were averaged for a final score. The Satisfaction with Life Scale has excellent psychometric properties ( 26 ). The original SWLS uses a 7-point Likert scale, but several psychosocial scales in HRS were converted into 6-point scales to standardize and streamline measures. The average life satisfaction among respondents in the present study was 4.41 (SD = 1.19), and the reliability of the scale was excellent (α = 0.90).
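The scoring rule described above (five items, each rated 1 to 6, averaged into one score) can be sketched in Python as follows. The function name and the example ratings are hypothetical and serve only to illustrate the rule; they are not taken from the HRS data.

```python
def swls_score(item_ratings):
    """Average five Satisfaction with Life Scale items rated on a 1-6 scale."""
    if len(item_ratings) != 5:
        raise ValueError("the scale has exactly five items")
    if any(not 1 <= rating <= 6 for rating in item_ratings):
        raise ValueError("each item is rated from 1 to 6")
    return sum(item_ratings) / 5


# Hypothetical respondent, for illustration only.
example_score = swls_score([5, 4, 5, 4, 4])
```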

Doctor Visits Measurement

In 2004, 2006, 2008, and 2010, respondents were asked: “Aside from any hospital stays or outpatient surgery, how many times have you seen or talked to a medical doctor about your health, including emergency room or clinic visits in the last two years?” The same HRS item has been used to measure the frequency of doctor visits in other studies ( 27 ). The validity of self-report as an estimate of doctor visits has been demonstrated: self-reported doctor visits show substantial agreement with both administrative claims and medical records ( 28 - 31 ).

Doctor visit reports from the 2004 and 2006 waves were summed to cover a four-year period spanning from 2002 to 2006 (M = 18.15, SD = 23.29). Doctor visit reports from the 2008 and 2010 waves were summed to cover a four-year period spanning from 2006 to 2010 (M = 20.42, SD = 24.10). The 2006-2010 composite was the dependent measure in the present study, and the 2002-2006 composite was used as a baseline covariate.

Baseline Covariates

Baseline covariates were assessed in 2006. Potential confounds of the association between life satisfaction and frequency of doctor visits included relevant sociodemographic, psychosocial, and health-related risk factors.

Sociodemographic covariates included age, gender, race/ethnicity (European-American, African-American, Hispanic, other), marital status (married/not married), educational attainment (no degree, GED or high school diploma, college degree or higher), and total wealth (<25,000; 25,000-124,999; 125,000-299,999; 300,000-649,999; >650,000—based on quintiles of the score distribution in this sample).

Psychosocial covariates included both negative (anxiety and depression) and positive (optimism and social integration) psychosocial factors. Further information about these psychosocial measures can be found in the HRS Psychosocial Manual: http://hrsonline.isr.umich.edu/sitedocs/userg/HRS2006-2010SAQdoc.pdf .

Health-related covariates included smoking status (never, former, current), frequency of moderate (e.g., gardening, dancing, walking at a moderate pace) and vigorous exercise (e.g., running, swimming, aerobics) (never, 1-4 times per month, more than once a week), frequency of alcohol consumption (abstinent, less than 1 or 2 days per month, 1 to 2 days per week, and more than 3 days per week), number of previous doctor visits between 2002 and 2006 (as just explained), health insurance status (yes/no), functional status (number of activities of daily living performed with difficulty: bathing, eating, dressing), and an index of major chronic illnesses.

For the chronic illness index, self-reports of a doctor's diagnosis of eight major medical conditions were recorded at baseline: (1) high blood pressure, (2) diabetes, (3) cancer or malignant tumor of any kind, (4) lung disease, (5) heart attack, coronary heart disease, angina, congestive heart failure, or other heart problems, (6) emotional, nervous, or psychiatric problems, (7) arthritis or rheumatism, and (8) stroke. Respondents reported an average of 1.96 (SD = 1.37) conditions. Self-reported health measures used in HRS have been rigorously assessed for their validity and reliability ( 24 , 25 ).

Statistical Analyses

Because the number of doctor visits had a skewed distribution, we analyzed the data using a generalized linear model with a gamma distribution and log link, rather than using an ordinary least squares regression ( 32 ). The Box Cox, Modified Park, Pregibon's Link, and Hosmer-Lemeshow tests were used as diagnostic and goodness of fit measures. Because the model uses a log link, the estimated β coefficients are on the log scale and are not directly interpretable. To obtain more easily interpretable results, the coefficients estimated by the gamma model were exponentiated into rate ratio (RR) estimates using the eform option in Stata (StataCorp. 2011. Stata Statistical Software: Release 12. College Station, TX: StataCorp LP). The model was weighted in Stata using HRS sampling weights to account for the complex multistage probability survey design (e.g., non-response, sample clustering, stratification, further post-stratification).
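As a rough illustration of the modelling approach described above, the following Python sketch fits a gamma GLM with a log link by iteratively reweighted least squares and exponentiates the slope into a rate ratio. It is not the authors' Stata code: it uses a single predictor, ignores the HRS sampling weights, and runs on made-up data (hypothetical visit counts constructed to fall by 11% per unit of life satisfaction). All names are assumptions introduced for the example.

```python
import math


def fit_gamma_log_glm(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x for a gamma GLM with log link via IRLS.

    With gamma variance mu^2 and a log link, the IRLS weights are all equal,
    so each step is an ordinary least-squares fit of the working response on x.
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    eta = [math.log(yi) for yi in y]  # starting values; requires y > 0
    b0 = b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        # Working response: eta + (y - mu) * d(eta)/d(mu), where d(eta)/d(mu) = 1/mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        z_bar = sum(z) / n
        sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
        b1 = sxz / sxx
        b0 = z_bar - b1 * x_bar
        eta = [b0 + b1 * xi for xi in x]
    return b0, b1


# Hypothetical data: visits fall by 11% for each one-unit increase in the score.
life_satisfaction = [1, 2, 3, 4, 5, 6]
visits = [25.0 * 0.89 ** (s - 1) for s in life_satisfaction]

intercept, slope = fit_gamma_log_glm(life_satisfaction, visits)
rate_ratio = math.exp(slope)  # exponentiated coefficient, as with Stata's eform option
```

Because the toy data follow the log-linear pattern exactly, the recovered rate ratio matches the 0.89 used to build them; real survey data would additionally need the covariates and weights described in the text.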

In order to simultaneously address the large number of covariates and avoid over-fitting the models, we examined the impact of the risk factors by creating a core model (Model 1) and then considered the impact of related covariates in turn. A total of four models were created. Model 1, the core model, adjusted for age, gender, race/ethnicity, marital status, education level, and total wealth; Model 2 – core model + psychosocial factors (depression, anxiety, optimism, and social integration); and Model 3 – core model + health related factors (smoking, exercise, alcohol use, number of previous doctor visits, insurance, index of major chronic illnesses, and functional status). Although doing so could overfit the model and raise multicollinearity issues, we also created a Model 4, which included all 17 covariates.

Prior research suggests that the association between life satisfaction and doctor visits varies by degree of engagement in various health behaviors, a potential effect modification. Therefore, we tested for potential interactions by creating interaction terms between life satisfaction and three health behaviors: smoking status, frequency of alcohol use, and frequency of exercise. All three health behaviors were dummy coded with abstention from an activity (e.g., no smoking, no alcohol intake, no exercise) as the reference group. The appropriate variables were mean centered in order to reduce multicollinearity problems. We also tested for a threshold effect. Life satisfaction scores were categorized into low (life satisfaction scores ranging from 1-3) and high (life satisfaction scores ranging from 4-6) based on the graph displaying the number of doctor visits as a function of life satisfaction scores.
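The construction of the interaction terms described above (dummy coding with a reference group, mean centering, then multiplying) can be sketched in Python. This is a minimal illustration under assumed names and toy values, not the authors' procedure; in particular, which variables were centered in the original analysis is not specified beyond "the appropriate variables".

```python
def mean_center(values):
    """Subtract the sample mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dummy_code(categories, reference):
    """Return one 0/1 indicator column per non-reference level."""
    levels = sorted(set(categories) - {reference})
    return {lvl: [1 if c == lvl else 0 for c in categories] for lvl in levels}


def interaction_terms(centered, dummies):
    """Multiply the centered score by each indicator column."""
    return {lvl: [c * d for c, d in zip(centered, col)]
            for lvl, col in dummies.items()}


# Hypothetical toy data, for illustration only.
life_satisfaction = [2.0, 4.0, 6.0, 4.0]
smoking = ["never", "former", "current", "never"]

centered = mean_center(life_satisfaction)
smoking_dummies = dummy_code(smoking, reference="never")
interactions = interaction_terms(centered, smoking_dummies)
```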

Finally, we examined if the association between life satisfaction and frequency of doctor visits differed if people suffered from a low versus high number of chronic diseases. Based on cutoff scores set by the U.S. Department of Health and Human Services Strategic Framework on Multiple Chronic Conditions, people were grouped into either a low chronic disease subgroup (zero or one chronic illness) or high chronic disease subgroup (two or more chronic illnesses).

Missing Data and Sensitivity Analyses

For all study variables, the overall item non-response rate was 1.71%. However, the missing data were scattered throughout the variables and resulted in a 25.69% loss of respondents when analysis with list-wise deletion was attempted. Therefore, to obtain less biased estimates, multiple imputation procedures were used to impute missing data. Sensitivity analyses showed that the results were significant before and after the implementation of multiple imputation. We therefore used the dataset with multiple imputation for all analyses reported here because this technique provides a more accurate estimate of association than other methods of handling missing data. Age, gender, race, index of chronic illnesses, marital status, total wealth, and functional status were used for the multiple imputation procedure. Further sensitivity analyses were conducted by excluding respondents at the two extremes of doctor visits: those who visited doctors the most (top 5%) and the least (bottom 5%). Again, results were significant whether the extreme respondents were included or excluded. Therefore, the imputed data without the exclusion of extremes were used.
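A back-of-the-envelope calculation shows how a small item non-response rate can still remove a large share of respondents under list-wise deletion. The Python sketch below assumes, purely for illustration, that missingness is independent across variables and occurs at the same rate in each; neither assumption is stated in the study, and the choice of 17 variables simply mirrors the number of covariates.

```python
def listwise_loss(item_missing_rate, n_variables):
    """Expected share of respondents with at least one missing value,
    assuming independent missingness at the same rate in every variable."""
    complete_share = (1 - item_missing_rate) ** n_variables
    return 1 - complete_share


# 1.71% item non-response across 17 variables (illustrative assumption).
expected_loss = listwise_loss(0.0171, 17)
```

Under these simplifying assumptions the expected loss is roughly a quarter of the sample, which is of the same order as the 25.69% reported.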

Descriptive Statistics

The average age of respondents at baseline was 68 years (SD = 9.39). Respondents tended to be female (59%), European-American (78%), and married (66%). Most had a high school degree (55%) or attended some college (27%). Wealth varied, as would be expected. More than 95% had health insurance. About 43% of respondents had never smoked, and 42% were former smokers. Only 24% reported that they exercised at least weekly, whereas 61% reported that they never exercised. About 47% never consumed alcohol, and 18% did so no more than one day per week. Table 1 shows all of the descriptive statistics.

On a scale ranging from 1 to 6, the average life satisfaction score was 4.41 (SD = 1.19). Life satisfaction scores were distributed in the following manner: “1” (2.85%, n=182), “2” (6.37%, n=406), “3” (11.44%, n=730), “4” (23.05%, n=1,471), “5” (38.71%, n=2,469), and “6” (17.58%, n=1,121). Furthermore, Supplemental Table 1 shows the correlations among the continuous factors in the study and Supplemental Table 2 shows the distribution of life satisfaction scores among the categorical factors. Finally, in unadjusted analyses, the average number of doctor visits (per year) among people who reported the highest life satisfaction was 4.64 compared to an average 6.39 visits among people who reported the lowest life satisfaction ( Figure 1 ).

[Figure 1]

Life Satisfaction and the Number of Doctor Visits

People with higher life satisfaction made fewer doctor visits than those with lower life satisfaction (RR = 0.89, 95% CI = 0.86 to 0.93; p < .001) in the core model (Model 1, Table 2 ). Each unit increase in life satisfaction was associated with a 10.57% reduction in the number of reported doctor visits over the four-year follow-up. In this model, respondents reporting the highest life satisfaction made 44.16% fewer doctor visits than those reporting the lowest life satisfaction.

Even after adjusting for all 17 covariates (Model 4, Table 2 ), each unit increase in life satisfaction was associated with a 3.6% reduction in the number of reported doctor visits. In this fully-adjusted model, people reporting the highest life satisfaction made 18.46% fewer doctor visits than those reporting the lowest life satisfaction. As expected, older individuals made more visits (RR = 1.01, 95% CI = 1.00 to 1.01; p =.003), as did those with more chronic illnesses (RR = 1.12, 95% CI = 1.09 to 1.15; p < .001), those with higher depression (RR = 1.02, 95% CI = 1.00 to 1.04; p =.047), and those who visited doctors more frequently in the past (RR = 1.02, 95% CI = 1.01 to 1.02; p < .001). Also, African-American respondents (RR = 0.87, 95% CI = 0.80 to 0.95; p =.003) made fewer visits. None of the other covariates predicted doctor visit frequency, which does not mean that they are unimportant in and of themselves but that their independent effects were not apparent when other predictors were controlled in the model.
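The comparisons between the highest and lowest life satisfaction groups appear to follow from compounding the per-unit rate ratio over the five one-unit steps between scores of 1 and 6. The Python sketch below reproduces that arithmetic from the rounded RRs reported above; the function name is hypothetical, and the use of rounded RRs is an inference from the published figures rather than something the text states.

```python
def percent_fewer_visits(rate_ratio, unit_increase=1):
    """Percent reduction in visits implied by a per-unit rate ratio
    compounded over a given increase in the predictor."""
    return (1 - rate_ratio ** unit_increase) * 100


# Scores of 1 and 6 are five one-unit steps apart.
core_model = percent_fewer_visits(0.89, unit_increase=5)   # Model 1
full_model = percent_fewer_visits(0.96, unit_increase=5)   # Model 4
```

Rounded to two decimals these give 44.16 and 18.46, matching the percentages quoted for Models 1 and 4.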

Interactions Between Life Satisfaction and Health Behaviors

To examine if the effects of life satisfaction on the number of doctor visits varied by degree of engagement in various health behaviors, we examined the interaction between life satisfaction and three behaviors: smoking status, frequency of alcohol use, and frequency of exercise. While the interaction term was not significant for smoking, the interaction term was significant for frequency of alcohol use and frequency of exercise.

The interaction between life satisfaction and drinking frequency showed that, among people who drank the most alcohol, the association between life satisfaction and doctor visits was almost non-existent ( p =.035). However, all other subgroups of drinkers (including those who abstained from drinking) showed a strong and clear relationship between higher life satisfaction and fewer doctor visits ( Figure 2 ). Furthermore, higher life satisfaction was associated with fewer doctor visits only among the subgroup of adults who did not exercise ( Figure 3 ). For people who performed any amount of exercise, life satisfaction did not appear to be linked with frequency of doctor visits.

[Figure 2]

Subgroup Analyses by Level of Life Satisfaction

We tested for a possible threshold effect. Life satisfaction scores were categorized into low (scores ranging from 1-3) and high (scores ranging from 4-6) based on the graph displaying the number of doctor visits as a function of life satisfaction scores ( Figure 1 ). Subgroup analyses showed that life satisfaction was associated with frequency of doctor visits only in the high life satisfaction group (scores ranging from 4-6; RR = 0.91, 95% CI = 0.87 to 0.95; p < .001; Model 1, Supplemental Table 3 ). In contrast, life satisfaction was not associated with frequency of doctor visits in the low life satisfaction group (scores ranging from 1-3; RR = 1.03, 95% CI = 0.91 to 1.17; p = .62; Model 1, Supplemental Table 3 ). Supplemental Table 3 shows results from all of the models for both subgroups.

Subgroup Analyses by High Versus Low Chronic Disease Status

Based on cutoff scores set by the U.S. Department of Health and Human Services, people were grouped into either a low (less than two chronic illnesses) or high chronic disease subgroup (two or more chronic illnesses). The same four covariate models that were used in our main analyses were used in these analyses. However, the index of chronic illnesses was removed from the list of covariates because it was the variable we were dividing the sample by.

While the association between life satisfaction and doctor visits remained significant in all models for the low chronic disease subgroup, the association was not significant in the fully adjusted model for the high chronic disease subgroup (RR = 0.97, 95% CI = 0.94 to 1.00; p = 0.079; Model 4, Supplemental Table 4 ). Furthermore, in every model, the magnitude of the association between life satisfaction and doctor visits was larger in the low chronic disease subgroup than in the high chronic disease subgroup. Supplemental Table 4 shows results from all of the models for both subgroups. In an exploratory manner, we attempted different cutoff points and found that as we increased the number of chronic diseases required for placement in the high chronic disease subgroup, the association between life satisfaction and doctor visits decreased in a monotonic fashion (data not shown).

The current study investigated the prospective association between life satisfaction and frequency of doctor visits in a nationally representative sample of older U.S. adults. People with higher life satisfaction made fewer doctor visits than their less satisfied counterparts, even after controlling for likely confounds, including baseline number of chronic illnesses, frequency of past doctor visits, functional status, and an array of other possible factors. The association between life satisfaction and doctor visits held regardless of how healthy or ill respondents were at baseline. To our knowledge, this is the first study to examine the association between life satisfaction and health care use.

Why is life satisfaction linked with fewer doctor visits among older US adults? Satisfied adults may visit doctors less frequently because they are healthier, the result of being more socially engaged and supported, having more realistic goals, possessing a greater sense of meaning and purpose, having better health-promoting habits, and managing their existing health problems more effectively ( 33 , 34 ). For example, at baseline in 2006, satisfied respondents were more likely to report psychological, behavioral, and social resources associated with well-being, such as higher optimism ( r = .34, p < .001), lower depression ( r = -.37, p < .001), greater social integration ( r = .12, p < .001), better functional status ( r = .13, p < .001), more wealth ( r = .24, p < .001), and intact marriages ( r = .17, p < .001); they were also more likely to exercise ( r = .13, p < .001) and less likely to smoke ( r = -.09, p < .001). We controlled for these covariates in predicting subsequent doctor visits from baseline life satisfaction, meaning that they did not fully explain why those who are satisfied visit doctors less frequently. However, the pattern of cross-sectional correlations at baseline underscores the variety of assets to which life satisfaction might be linked. Also, dissatisfied adults may visit doctors more frequently not only because they are less healthy, but also because they are inappropriately worried about their health status, leading to overtreatment and wasteful care ( 35 ). For example, in 2006, less satisfied respondents reported higher levels of general anxiety ( r = -.31, p < .001).

To examine if the association between life satisfaction and frequency of doctor visits varied by degree of engagement in various health behaviors, we examined the interaction between life satisfaction and three health behaviors: smoking status, frequency of alcohol use, and frequency of exercise. While the interaction term was not significant for smoking, the interaction term was significant for frequency of alcohol use and frequency of exercise.

Interaction analyses revealed that the relationship between life satisfaction and doctor visits was weaker in certain subgroups. For example, the association between life satisfaction and doctor visits was almost non-existent among people who drank alcohol excessively. A possible reason for this finding is that older adults who drink frequently tend to ignore symptoms of disease, visiting doctors only when the disease moves onto advanced stages ( 36 ). However, besides the subgroup of people who drank alcohol excessively, a strong and clear relationship existed between higher life satisfaction and fewer doctor visits. Furthermore, in line with past research, people who abstained from drinking visited doctors the most frequently and people who drank the most alcohol visited doctors the least frequently ( 37 , 38 ). Another subgroup where the relationship between life satisfaction and doctor visits did not exist was in people who performed any type of exercise. On the other hand, higher life satisfaction was associated with fewer doctor visits among adults that did not exercise. This observation should be further examined in future research.

Another set of analyses revealed that life satisfaction was associated with frequency of doctor visits only in the high life satisfaction group (scores ranging from 4-6) and not in the low life satisfaction group (scores ranging from 1-3). This result was initially counter-intuitive because past research shows that psychological problems, such as depression, are linked with increased health care use ( 5 , 6 ). Therefore, we expected that the association between increased life satisfaction and reduced doctor visits would be particularly strong in the low life satisfaction group because we hypothesized that each unit increase in life satisfaction would be more beneficial for people who have lower levels of life satisfaction.

The absence of psychological problems, however, does not indicate the presence of psychological well-being ( 23 ). In this sample, the correlation between life satisfaction and depression was moderate ( r = -.37, p < .001). Furthermore, positive psychological factors and psychological problems show distinct biological correlates ( 23 ). Recent research has also shown that positive psychological factors (e.g., optimism, positive affect, and life satisfaction) show strong and unique relationships with enhanced health, even after adjusting for an array of risk factors and psychological problems ( 7 - 22 ). These findings help explain why life satisfaction was associated with frequency of doctor visits only in the high life satisfaction group and not in the low life satisfaction group. Currently, researchers suggest that alleviating psychological problems will reduce health care use. This particular finding suggests that above and beyond alleviating psychological problems, boosting psychological well-being will also reduce health care use.

Further subgroup analyses were conducted. Based on cutoff scores set by the U.S. Department of Health and Human Services, people were grouped into either a low or high chronic disease subgroup. Analyses revealed that in every model, the magnitude of the association between life satisfaction and doctor visits was larger in the low chronic disease subgroup than in the high chronic disease subgroup. In an exploratory manner, we also tried different cutoff points and found that as we increased the number of chronic illnesses required for placement in the high chronic disease subgroup, the association between life satisfaction and doctor visits decreased. These preliminary findings suggest that sicker patients will continue seeking appropriate medical care irrespective of their level of life satisfaction. Therefore, it does not appear that people skip necessary doctor visits simply because they have high life satisfaction. However, these are preliminary results and more research is needed in this area. Future research should pinpoint the subgroups of people who do and do not show an association between life satisfaction and doctor visits.

Two sets of analyses in this study may appear contradictory, but they are not. In one set of analyses, the results showed that the association between life satisfaction and doctor visits was significant even after adjusting for the baseline number of chronic illnesses. In a later set of subgroup analyses, the association between life satisfaction and doctor visits was stronger among people with zero or one chronic illness than among people with two or more chronic illnesses. Showing that an association remains significant after controlling for a covariate does not preclude the possibility of an interaction with that covariate, which is what the subgroup analyses revealed.

This study has limitations. Due to the structure of HRS data, we were unable to examine the types of doctor visits or the possibility that individuals with higher life satisfaction made fewer but more expensive doctor visits. This possibility seems unlikely, given that higher life satisfaction has been linked with an array of health promoting habits and better health outcomes ( 18 - 20 , 33 ). Further studies examining the types of doctor visits as a function of life satisfaction are needed. Also, doctor visits were assessed using self-reported data. The validity of self-reported doctor visits has been supported in past research: the data show high agreement between self-reported doctor visits and medical records or administrative claims ( 28 - 31 ). Still, further research that uses objective measures of doctor visits would fine-tune the present findings.

Furthermore, each unit increase in life satisfaction was associated with a 10.57% decrease in doctor visits in the core model; in the fully adjusted model, which controlled for all 17 covariates, the decrease was 3.6%. Despite the seemingly small effect size, these findings should not be overlooked for three reasons. First, the analyses showed that life satisfaction was associated with fewer doctor visits even after controlling for an extensive array of covariates. Given the large number of covariates, the results most likely underestimate the true association between life satisfaction and doctor visits because several of the covariates may serve as mechanisms by which life satisfaction is associated with doctor visits. For example, people with higher life satisfaction may visit the doctor less frequently, because they engage in better health behaviors (e.g., people with higher satisfaction smoke less, exercise more, drink alcohol in moderation), yet we controlled for these variables ( 33 ).

Second, there are many cases where small effect sizes translate into meaningful outcomes. Therefore, it is important to interpret findings based on how constructs operate in the real world and also examine if these effects are cumulative across a person's lifespan ( 39 , 40 ). The effect of life satisfaction on doctor visits is cumulative when we consider how life satisfaction can potentially reduce the number of doctor visits over an individual's lifetime, and when the number of doctor visits is aggregated at the population level.

Third, the medical literature provides numerous examples of interventions that show only a modest association with improved health but save many lives at the population level, and that have led to significant changes in the standard of care. For example, although the association between aspirin consumption and the risk of myocardial infarction is only -.03 (smaller than the associations found in this study), one representative study found that aspirin consumption resulted in 85 fewer myocardial infarctions among the patients of 10,845 physicians ( 41 ). Aspirin is now routinely recommended for people at high risk for heart attacks. Similarly, even small increases in life satisfaction may yield substantial population-level decreases in healthcare use and free up doctors so that they can care for those most in need of immediate medical attention.

Life satisfaction can and does change ( 42 , 43 ), and several small-scale interventions have been shown to reliably and meaningfully raise it (e.g. 44-46). More research, however, needs to examine if interventions that raise life satisfaction can be feasibly disseminated widely at the population level. Based on the mounting observational evidence linking different facets of psychological well-being with enhanced physical health, researchers have conducted randomized controlled trials that aimed to induce positive affect (a distinct but overlapping factor). Results from these studies show that inducing positive affect can successfully change crucial health behaviors, such as increased medication adherence and physical activity, in chronic disease patients ( 47 - 49 ). However, these studies did not assess if these enhanced health behaviors led to decreased doctor visits.

Successfully addressing the problem of rising medical costs will likely require a multi-faceted approach. We hope the preliminary findings in this study, combined with past studies linking life satisfaction with enhanced health behaviors, improved health, and longer life, will spark new conversations about the possible links between psychological well-being and healthcare use. Further investigation in this line of research may reveal innovative ways of containing healthcare costs.

Supplementary Material

Supplementary Table 1: Correlation Table of All Continuous Variables

Supplementary Table 2: Distribution of Life Satisfaction Scores Among All Categorical Variables

Supplementary Table 3: Association Between Life Satisfaction and Doctor Visits By Subgroup

Supplementary Table 4: Association Between Life Satisfaction and Doctor Visits By Chronic Disease Status

Acknowledgments

We would like to thank Brady T. West and Richard Gonzalez for their advice on statistical analyses. We also thank Susanne Wurm, Clemens Tesch-Romer, and members of the German Centre of Gerontology for their helpful comments and suggestions. We also acknowledge the Health and Retirement Study, which is conducted by the Institute for Social Research at the University of Michigan, with grants from the National Institute on Aging (U01AG09740) and the Social Security Administration. Finally, we would like to thank the editor, associate editor, and the anonymous reviewers for their valuable comments and suggestions.

Sources of Funding: Support for this publication is provided by the Robert Wood Johnson Foundation's Pioneer Portfolio, which supports innovative ideas that may lead to breakthroughs in the future of health and health care. The Pioneer Portfolio funding was administered through a Positive Health grant to the Positive Psychology Center of the University of Pennsylvania, Martin Seligman, director.

Role of the Sponsor: The funding sources had no influence on the design or conduct of the study; collection, management, analysis or interpretation of the data; or preparation, review, or approval of the manuscript. Eric S. Kim had full access to all data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. All authors contributed to the design of the study and interpretation of the findings, and have read, commented on, and approved the manuscript.

Disclosures: There are no conflicts of interest.


dvisits: Doctor visits in Australia

Description.

The data come from the Australian Health Survey of 1977-78 and consist of 5190 single adults where young and old have been oversampled.

A data frame with 5190 observations on the following 19 variables.

1 if female, 0 if male

Age in years divided by 100 (measured as mid-point of 10 age groups from 15-19 years to 65-69, with 70 or more treated as 72)

age squared

Annual income in Australian dollars divided by 1000 (measured as mid-point of coded ranges Nil, less than 200, 200-1000, 1001-, 2001-, 3001-, 4001-, 5001-, 6001-, 7001-, 8001-10000, 10001-12000, 12001-14000, with 14001- treated as 15000)

1 if covered by private health insurance fund for private patient in public hospital (with doctor of choice), 0 otherwise

1 if covered by government because low income, recent immigrant, unemployed, 0 otherwise

1 if covered free by government because of old-age or disability pension, or because invalid veteran or family of deceased veteran, 0 otherwise

Number of illnesses in past 2 weeks with 5 or more coded as 5

Number of days of reduced activity in past two weeks due to illness or injury

General health questionnaire score using Goldberg's method. High score indicates bad health

1 if chronic condition(s) but not limited in activity, 0 otherwise

1 if chronic condition(s) and limited in activity, 0 otherwise

Number of consultations with a doctor or specialist in the past 2 weeks

Number of consultations with non-doctor health professionals (chemist, optician, physiotherapist, social worker, district community nurse, chiropodist or chiropractor) in the past 2 weeks

Number of admissions to a hospital, psychiatric hospital, nursing or convalescent home in the past 12 months (up to 5 or more admissions which is coded as 5)

Number of nights in a hospital, etc. during most recent admission: taken, where appropriate, as the mid-point of the intervals 1, 2, 3, 4, 5, 6, 7, 8-14, 15-30, 31-60, 61-79, with 80 or more nights coded as 80. If no admission in past 12 months then equals zero

Total number of prescribed and nonprescribed medications used in past 2 days

Total number of prescribed medications used in past 2 days

Total number of nonprescribed medications used in past 2 days
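The descriptions above mention two coding conventions: top-coding ("5 or more coded as 5") and replacing an interval with its mid-point (nights in hospital). The Python sketch below shows one reading of those rules. The function names are hypothetical, and the mid-point values are computed from the interval bounds listed above, not taken from the data file, so the stored values may differ.

```python
# Illustrative recoding helpers. The names are hypothetical and are not
# part of the dataset or of any package.

def top_code(count, cap=5):
    """Cap a count at `cap`, as in 'five or more coded as 5'."""
    return min(count, cap)


# Closed intervals of nights that are replaced by their mid-point.
NIGHT_INTERVALS = [(8, 14), (15, 30), (31, 60), (61, 79)]


def code_nights(nights):
    """Recode nights in hospital.

    0-7 nights are kept as they are, each listed interval becomes its
    mid-point, and 80 or more nights are coded as 80.
    """
    if nights >= 80:
        return 80.0
    for low, high in NIGHT_INTERVALS:
        if low <= nights <= high:
            return (low + high) / 2
    return float(nights)


if __name__ == "__main__":
    for raw in (0, 3, 10, 20, 45, 70, 95):
        print(raw, "->", code_nights(raw))
```

Both conventions compress the upper tail of a count variable, which is worth keeping in mind when such a variable is used as a predictor or as the outcome of a count model.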

Sexually transmitted infection rates have risen sharply among adults 55 and older, CDC data shows


Sexually transmitted infections are becoming more common in older adults.

Rates of chlamydia, gonorrhea and syphilis in people ages 55 and up more than doubled in the U.S. over the 10-year period from 2012 to 2022, according to data from the Centers for Disease Control and Prevention.

The number of syphilis cases among people ages 55 and up increased seven-fold during those 10 years, while gonorrhea cases increased nearly five-fold and chlamydia cases more than tripled during that time. 

A presentation to be delivered Thursday — part of a lead-up event to the European Congress of Clinical Microbiology and Infectious Diseases next month — warns that both doctors and older adults are overlooking the risks of STIs in this age group. 

“We talk about smoking, we talk about diet, exercise, so many things, and not about sex at all,” said Justyna Kowalska, the author of the presentation and a professor of medicine at the Medical University of Warsaw. 

The issue is not limited to the U.S. In England, surveillance data published in 2022 suggested that STI diagnoses rose 22% from 2014 to 2019 among people ages 45 and up. Chlamydia was the most common, followed by gonorrhea. 

Kowalska pointed to a few factors that may be driving up STI rates among older adults.

For one, people are living longer compared to past generations and enjoying more active lifestyles in their 60s, 70s and 80s. For many, that includes sex. A 2018 survey from AARP and the University of Michigan estimated that 40% of people ages 65 to 80 are sexually active, and nearly two-thirds are interested in sex. 

Hormone replacement therapy, which can treat symptoms of menopause, can prolong sexual desire in older women, while erectile dysfunction drugs like Viagra can help older men remain sexually active.

But older adults may not have gotten the type of sex education provided to teenagers today, according to Matthew Lee Smith, an associate professor at the Texas A&M School of Public Health.

"Back in the '30s, the '40s, the '50s, traditional school wasn’t really doing sexual education very formally," said Smith, who studies behavioral health risks in older adults.

Smith's research has shown that older adults lack some knowledge about STI transmission, symptoms and prevention.

He said doctors can be sheepish about asking older patients about their sexual activity, and older people often aren’t inclined to discuss their sex lives with peers or family members.

“No one wants to think about grandma doing this,” Smith said. “You certainly aren’t going to ask grandma if she was wearing condoms — and that’s part of the problem, because every individual regardless of age has the right to intimacy.”

Some older men may struggle with condom use, Smith said, because of either a lack of dexterity or erectile dysfunction.

What's more, he added, many older adults married at a younger age than is typical now and only had one sexual partner until they divorced or were widowed. So some might not think to use a condom, Smith said — especially since pregnancy isn’t a concern. 

Nursing homes also create opportunities for new sexual partners. The results of a U.S. survey of nursing home directors, published in 2016, found that sexual activity was common in these settings, which often have more female than male residents.

“In the heterosexual, older adult community, there’s a partner gap: Women live longer than men and there’s a larger proportion of females to men,” Smith said. “What it can lead to oftentimes is multiple partners and sharing of partners.”

Though STIs pose health risks to all age groups, older people may have a harder time clearing infections or be more susceptible to contracting them in the first place, medical experts said.

“The immune system is weaker, so you can get an infection easier, but there’s other physical things related to just sexual intimacy that make one more susceptible,” said Ethan Morgan, an assistant professor of epidemiology at The Ohio State University College of Nursing. Among women who are postmenopausal, for instance, the vaginal lining is more prone to tearing, which makes it easier for an infection to occur.

The experts stressed that doctors need to do a better job of discussing safe sex with older patients.

“We want them to have their best life," Smith said, "but we want them to have it safely."


Aria Bendix is the breaking health reporter for NBC News Digital.

Middle East latest: Israeli PM Netanyahu should be removed now, predecessor says - as charities pause aid to Gaza following deadly strike on convoy

Benjamin Netanyahu has "failed" the country and his handling of the war has been "outrageous", a former Israeli prime minister has said. It comes after seven aid workers, including three Britons, were killed in an IDF strike on their convoy.

Tuesday 2 April 2024 23:37, UK


  • Two charities have paused their aid to Gaza
  • Charity names aid workers killed in Israeli airstrike
  • Netanyahu: Israel 'deeply regrets tragic incident'
  • Ex-Israeli PM says Netanyahu should be 'removed immediately'
  • Ships carrying 240 tonnes of undelivered aid to turn back from Gaza
  • Alistair Bunkall: Israel's admission will not stop foreign leaders demanding answers
  • Podcast: Will volunteers leave Gaza after aid deaths?
  • Watch: Evidence suggests three separate strikes

The World Central Kitchen (WCK) has named all seven of the volunteers who were killed on Monday after a convoy they were travelling in was hit as it was leaving a warehouse in Deir al Balah overnight.

Three of those were Britons who all worked on the charity's security team. 

They have been named as:

  • John Chapman, 57
  • James (Jim) Henderson, 33
  • James Kirby, 47

The other four included nationals from Poland and Australia, a dual citizen of the US and Canada, and a Palestinian who was driving the car they were all travelling in.

They have been named by the charity as:

  • Saifeddin Issam Ayad Abutaha, 25, Palestine
  • Lalzawmi (Zomi) Frankcom, 43, Australia
  • Damian Sobol, 35, Poland
  • Jacob Flickinger, 33, USA & Canada 

Erin Gore, chief executive of WCK, said: "These are the heroes of World Central Kitchen. These seven beautiful souls were killed by the IDF in a strike as they were returning from a full day's mission."

She said their smiles, laughter and voices will forever be embedded in memories. 

"We are reeling from our loss. The world's loss," she added.

Two charities have paused providing aid to Gaza following the news that seven aid workers for World Central Kitchen were killed in an IDF airstrike.

Anera and Project Hope have both issued statements saying they are stopping their services in the territory over safety fears.

Anera said "the escalating risks associated with aid delivery leave us with no choice but to halt operations until our staff regain confidence that they can do their work without undue risk".

Project Hope stated: "We have paused all programming in Deir al Balah and Rafah for the next three days in solidarity with World Central Kitchen and to reassess the security situation as we prioritize our staff members’ safety."

Nearly 200 aid workers have died in Gaza since the war began in October.

Two of three British aid workers who died in an airstrike in Gaza have been reportedly named as John Chapman and James Henderson.

The World Central Kitchen (WCK) volunteers are believed to be among seven aid workers killed on Monday after a convoy they were travelling in was hit as it was leaving a warehouse in Deir al Balah overnight.

Documents seen by Sky News suggest Mr Chapman had been due to leave the Palestinian territory on 1 April.

The US has said that it was not involved in the Israeli airstrike on Iran's embassy compound in Damascus that killed two Iranian generals and five military advisers.

The White House national security spokesman John Kirby dismissed claims from Iran that Washington had some responsibility for the attack as "nonsense".

"Let me make it clear. We had nothing to do with the strike in Damascus," he told a briefing. 

"We weren't involved in any way."

Pentagon spokesperson Sabrina Singh said Israel provided no advance warning of the strike in the Syrian capital.

"We were not notified by the Israelis about their strike or the intended target of their strike in Damascus," she said.

Shortly before the attack, Israel notified the US that it would be operating in Syria, but it did not identify a target, two officials said on condition of anonymity.

Geolocated images show three vehicles appearing in three different locations across a distance of around 2.4km (1.5 miles) in Deir al Balah. 

The World Central Kitchen says its team was travelling in a three-car convoy and its movements had been coordinated with the Israeli army.

The incident "wiped out the operations team" of a major aid organisation, which is helping to feed half a million people in Gaza, according to a boss of the World Food Programme.

Sky News' Data and Forensics Unit looks at what we know so far about what happened.

Hundreds of anti-government protesters have gathered in front of the Knesset in Jerusalem for the third day in a row.

They are calling for the immediate release of hostages, the resignation of Benjamin Netanyahu and a general election.

Images showed people marching in the streets holding up placards reading "bring them home" and "stop the war".

Hundreds of tents have been set up in front of the Knesset, along with photos of those still held hostage after the 7 October attack by Hamas.

Antony Blinken extended his condolences to the friends and families of the aid workers who lost their lives in the IDF strike in Gaza.

Speaking in Paris, the US secretary of state said that World Central Kitchen had been doing "extraordinary and critical work" in Gaza and around the world.

He said: "The victims of yesterday's strike join a record number of humanitarian workers who have been killed in this particular conflict.

"These people are heroes, they show the best of what humanity has to offer when the going gets tough. They have to be protected.

"We shouldn't have a situation where people who are simply trying to help their fellow human beings are, themselves, at grave risk.

"We spoke directly to the Israeli government about this incident and urged a swift and impartial investigation to understand exactly what happened."

He added that the US and France had pressed on Israel the need to ensure civilians, both Gazans and aid workers, were not caught up in the crossfire of the war.

Mr Blinken has arrived in the French capital ahead of heading to Brussels for a NATO ministerial meeting tomorrow. 

Downing Street has just confirmed that Rishi Sunak spoke to Benjamin Netanyahu this evening.

According to a spokesperson, Mr Sunak said "he was appalled by the killing of aid workers, including three British nationals, in an airstrike in Gaza yesterday and demanded a thorough and transparent independent investigation into what happened".

They added: "The prime minister said far too many aid workers and ordinary civilians have lost their lives in Gaza and the situation is increasingly intolerable."

You can read more of the statement and other Westminster reaction in our  Politics Hub

Antonio Guterres has condemned an attack on Iran's diplomatic premises in Damascus, calling on "all concerned to exercise utmost restraint and avoid further escalation", his spokesperson said. 

UN spokesman Stephane Dujarric said the secretary-general "cautions that any miscalculation could lead to broader conflict in an already volatile region, with devastating consequences for civilians who are already seeing unprecedented suffering in Syria, Lebanon, the Occupied Palestinian Territory, and the broader Middle East".

Iran has blamed Israel for the deadly strike, in which two of its senior military commanders were killed along with five officers.

Hossein Akbari, Tehran's ambassador to Damascus, who was not injured in the strike, promised the Iranian response would be "harsh".

In this video, our  military analyst Sean Bell  explains what we know so far about the IDF strike that killed seven aid workers, including three Britons.

The evidence suggests, he says, that the vehicles were hit separately and were spread out over about two and a half kilometres as they returned home from their shift.

This indicates three separate strikes, rather than a single strike as the Israeli prime minister has implied.



COMMENTS

  1. Medical Appointment Dataset Analysis

    Explore and run machine learning code with Kaggle Notebooks | Using data from Medical Appointment No Shows.

  2. 15 Open Datasets for Healthcare

    OpenfMRI: Other imaging data sets from MRI machines to foster research, better diagnostics, and training. It includes 95 datasets from 3372 subjects with new material being added as researchers make their own data open to the public. CT Medical Images: This one is a small dataset, but it's specifically cancer-related.

  3. Exploratory Data Analysis with Python: Medical Appointments Data

    The Exploratory Data Analysis (EDA) is a set of approaches which includes univariate, bivariate and multivariate visualization techniques, dimensionality reduction, cluster analysis. The main goal of EDA is to get a full understanding of the data and draw attention to its most important features in order to prepare it for applying more advanced ...

  4. FastStats

    Data are for the U.S. Adults who had a visit with a doctor or other health care professional. Percent of adults who had a visit with a doctor or other health care professional in the past year: 83.4% (2022) Source: Interactive Summary Health Statistics for Adults: National Health Interview Survey, 2019-2022.

  5. GitHub

    In this Mini project, we Explore the power of Data science by Analyzing Doctor visits using Python. We start by Collecting Data on Patient visits such as Age, Gender, Symptoms, and Diagnosis. Then, we use python libraries such as Pandas, Numpy and Matplotlib to Clean, Preprocessing, and Visualize the data. - SURYA2745/Doctor_Visit_Analysis

  6. Data Analytics and Modeling for Appointment No-show in Community Health

    Objectives: Using predictive modeling techniques, we developed and compared appointment no-show prediction models to better understand appointment adherence in underserved populations.Methods and Materials: We collected electronic health record (EHR) data and appointment data including patient, provider and clinical visit characteristics over a 3-year period.

  7. Solutions to Problem Set 4

    Solutions to Problem Set 4 Doctor Visits. Cameron and Trivedi (2009) have some interesting data on the number of office-based doctor visits by adults aged 25-64 based on the 2002 Medical Expenditure Panel Survey. We will use data for the most recent wave, available at https: ...

  8. A dataset of simulated patient-physician medical interviews with a

    These cases were simulated, recorded, transcribed, and manually corrected with the underlying aim of providing a comprehensive set of medical conversation data to the academic and industry community.

  9. Data

    Outpatient clinics' productivity largely depends on their appointment scheduling systems. It is crucial for appointment scheduling to understand the intrinsic heterogeneity in patient and service types and act accordingly. This article describes an outpatient clinic dataset of consultation service time with heterogeneous characteristics. The dataset contains 6637 consultation records ...

  10. Products

    NOTES: Visit rates are based on the July 1, 2018, set of estimates of the civilian noninstitutionalized population of the United States, as developed by the U.S. Census Bureau, Population Division. Total visits includes all visits by patients of all ages.

  11. Case Study: Doctor Visit Analysis using Python

    Case Study: Doctor Visit Analysis using Python

  12. 12 Notable Healthcare Datasets for 2022

    However, finding quality healthcare data to train these machines can be a challenge. Luckily, researchers, governments, and even private companies recognize the value of providing (anonymized) data to advance healthcare initiatives and the public good. Here are 12 notable healthcare datasets for 2022. V7 COVID-19 X-Ray Dataset.

  13. GitHub

    About. This project aims to analyze the "Doctor Visit Analysis" dataset to uncover insights into patient behavior and healthcare trends. The dataset encompasses details from diverse doctor visits, such as patient gender, illness specifics, age, income, and private or non-private sector affiliation.

  14. PDF Characteristics of Office-based Physician Visits, 2018

    Data from the National Ambulatory Medical Care Survey. In 2018, there were an estimated 267 office-based physician visits per 100 persons. The visit rate among females was higher than for males, and the rates for both infants and older adults were higher than the rates for those aged 1-64.

  15. (PDF) A Guide to Health Data Science Using Python

    This guide is a comprehensive guide to using Python programming for. data management, computation, descriptive statistics, visualization, spatial analysis, regression analysis and modeling for ...

  16. A Beginner's Guide to a Virtual Doctor's Visit

    Avoid positioning yourself in front of a bright window, as that obscures the view the provider will have of your face. Position your device so that your face is centered in the middle of the ...

  17. Primary Care Visit Regularity and Patient Outcomes: an Observational

    INTRODUCTION. Primary care is a key leverage point for transforming medical care and improving patient outcomes, especially for those with chronic disease. 1 High-quality primary care contributes to improved health outcomes, 2, 3 reduced health disparities, 4, 5 and lower cost of care. 6 The Chronic Care Model 1, 7 mentions regular primary care visits as an important ingredient for ...

  18. R: Australian Health Service Utilization Data

    A data frame containing 5,190 observations on 12 variables. Number of doctor visits in past 2 weeks. Factor indicating gender. Age in years divided by 100. Annual income in tens of thousands of dollars. Number of illnesses in past 2 weeks. Number of days of reduced activity in past 2 weeks due to illness or injury.

  19. Healthcare Datasets

    New off-the-shelf medical datasets are being collected across all data types. Contact us now to let go of your healthcare training data collection worries. Shaip high-quality Medical & Healthcare Datasets (Physician Audio, Transcribed Medical records, EHR, etc. from over 31 specialties) are a quick, cost-effective solution to train AI / Machine ...

  20. Life Satisfaction and Frequency of Doctor Visits

    Also, doctor visits were assessed using self-reported data. The validity of self-reported doctor visits has been supported in past research—the data show high agreement between self-reported doctor visits and medical records/administrative claims (28-31). Still, further research that uses objective measures of doctor visits would fine-tune ...

  21. Medical Expenditure Panel Survey Topics

    Description: Doctor visits/use/events consist of encounters that take place primarily in office-based settings and clinics. Doctor visit expenditures include all direct payments by individuals, private insurance (including TRICARE), Medicare, Medicaid, and other sources such as the Veterans Administration, Workers' Compensation, and miscellaneous public sources to providers of the services.

  22. dvisits function

    The data come from the Australian Health Survey of 1977-78 and consist of 5190 single adults where young and old have been oversampled. faraway (version 1.0.8)

  23. Sexually transmitted infection rates rose among older people, CDC data

    CDC data shows that rates of chlamydia, gonorrhea and syphilis in people ages 55 and up more than doubled over a 10-year period. The trend is prompting doctors to call for more discussions with ...

  24. Middle East latest: Ships loaded with 240 tonnes of aid to leave Gaza

    The IDF has expressed its "sincere sorrow" for the deaths of seven aid workers, including three Britons, in a strike it launched on Gaza. A spokesman insisted the IDF was "committed to ...