1. Introduction
In the classical literature on urban travel behavior, particularly in studies that measure the effects of the built environment, the mediating role of self-selection has always been a question. Self-selection can limit the effects of the urban environment and accessibility on urban travel behaviors. Thus, there is a constant need to understand its complex relations with trip demand and preferences. Self-selection results from two sources: attitudes and socio-demographics [1]. One of the most important self-selections is residential location choice, which might impact different aspects of travel behavior such as trip generation and mode choice.
So far, the effects of different objective measures on residential location choices have been examined, including transport issues such as toll strategies [2] and travel attributes [3], as well as housing characteristics such as the number of bedrooms in the house [4], housing price and home-school distance [5], school quality [6,7], lot size and unit size [7], and house space per person [8]. Although partially incomplete and inconsistent, the associations of individual and household characteristics and attitudes with residential self-selections have also been investigated, some of which are referred to in this paper. The role of the built environment in residential self-selections has been investigated mostly in Europe and the US [1,9,10,11,12,13,14]. Nevertheless, the contextual differences of these relations have not yet been examined. Since self-selections are part of human preferences, they stem from cultures and lifestyles, and it is therefore expected that they vary across geographical contexts, i.e., that they are context-specific. This specificity has not been thoroughly investigated in urban travel behavior studies, particularly regarding the Middle East and North Africa (MENA). There are very limited studies on this region; e.g., we already know that in Iran socioeconomic factors might play a stronger role in defining residential location choices compared to mobility needs [15]. In Alexandria, Egypt, availability of transportation modes, “nice neighborhoods”, and affordability are the strongest motives behind location decisions [16]. Similar to Iran, in Alexandria, socio-economic factors are generally stronger than urban mobility and spatial issues. On a regional scale, a study of the reasons for low occupancy rates in the new cities in Egypt revealed that six factors are correlated with nation-wide location choices: current inhabitants, the estimated size of the target group, the size of new cities, total number of housing units, distance to the nearby old city core, and distance to Greater Cairo [17]. We also know that in the large city of Hamedan, Iran, choosing a house location because of a nearby workplace is significantly and positively correlated with the level of urban sprawl around workplaces, quantified by Shannon Entropy (higher values indicating more sprawl) [18]. This probably refers to location choice near work when the workplace is located in sprawled areas, which usually offer fewer transportation choices such as public transit. Finally, in Abu Dhabi, United Arab Emirates, houses closer to points of interest are more likely to attract tenants [19]. These findings underline the importance of spatial accessibility in the city.
The literature on the MENA region is quite limited. Contexts in neighboring regions such as South Asia can therefore also be considered, with the aim of examining the topic in contexts located directly outside of Western or European countries. In the small city of Hafizabad, Pakistan, the availability of utility services and affordability are the most decisive factors in residential location preferences [20]. In the same country, in the cities of Rawalpindi and Islamabad, accessibility to public transportation is correlated with house rent and demand [21]. In the same region, in the city of Nagpur, the choice of house location in lower-income households is significantly correlated with type of neighborhood and proximity to relatives and place of work, and is also dependent on shopping travel mode choice. For high-income households, however, monthly rent, type of neighborhood, proximity to parking facilities, and shopping mode choice are associated with location choice [22]. In the city of Bhopal, the location choice predictors are different. For lower-income people, accessibility and economic attributes of housing stock are significant, while for wealthier residents, attributes of neighborhood characteristics are important [23]. The connection between residential self-selection and modal choices, especially for non-motorized travel, has also been found in Rajkot, India [24], and the connections with people’s satisfaction levels towards public transit availability have been shown in Delhi [25].
However, these findings, whether on the MENA region or on the neighboring regions, are neither comprehensive nor consistent, so they do not provide a basis for a holistic, precise overview of the topic. In the case of the MENA region, it is clear that the residential location predictors are more under-researched than those of some other developing regions, such as South Asia. As a result, planning based on local behavioral science has not been facilitated. This knowledge gap is targeted by this paper. From a methodological point of view, the studies on the MENA region have rarely applied Geographical Information Systems (GIS) to quantify land use in disaggregate data. Most of the studies on the region have applied statistical analysis methods to either aggregate or disaggregate data, but they are limited by the data derived from questionnaires (e.g., [15,16,17]). An exception is the work of Mehriar et al. (2020), who quantified the street network configuration and connectivity and brought the related variables into their models [18]. Nevertheless, their models still did not target residential self-selection precisely, since this variable was only an independent one, of which the correlation with urban sprawl was measured. Along with the topical lack of studies mentioned above, this shows a methodological shortcoming in the housing preference studies of the region, namely a lack of methods like GIS that make it possible to quantify the built environment. Studies that integrate the built environment into statistical models explaining the determinants of residential self-selection in the region or the connections with transport behaviors are rare, if not absent. Moreover, studies on the topic using disaggregate data are uncommon as well. The present study addresses these methodological shortcomings.
The objective of this paper is to examine the importance of mobility-related decisions on relocations in the large cities of the MENA region, exemplified by three megacities: Istanbul, Cairo, and Tehran. A side objective of this study is to investigate the correlation of age with residential location choice. Previous studies have already identified the need for further examination of these relations [26]. These three cities have been selected because all fall within the widely used definition of the MENA region, the majority of the residents of the cities are Muslim, they share socio-cultural similarities, all three are megacities, and the transportation infrastructures have much in common, unlike neighboring regions including Europe, Central Asia, Africa, or South Asia (with the exception of trams and some mobility issues in Istanbul). Of course, there are some dissimilarities among transportation behaviors in these three cities, but such smaller differentiations are also seen in other regions. All these considerations justify the classification of these three cities in one category of cities for the purpose of this study.
The present study is significant and novel, because, firstly, the context is generally under-studied regarding residential self-selection and travel behavior. In fact, the role of transportation decisions in housing preferences can be a new topic for several developing regions. Secondly, this study involves the built environment in the residential location choice modeling in form of accessibility factors by means of GIS work. The GIS work also includes quantification of commuting distance based on street networks, which is considered to be a time and energy-consuming quantification. The combination of these two novelty factors can be interesting not only for the MENA region, but also internationally.
The paper continues with an explanation of the methods (Section 2), including the case study, data collection, variables, and analysis methods such as binary probit modeling, sensitivity analysis, and hypothesis testing methods. Then, the findings of the statistical methods are explained under Section 3. In Section 4, the contextual differences between the determinants of house locations in the MENA region are described in relation to the findings of high-income countries, and finally, some implications for planning purposes are explained.
2. Methodology
2.1. Questions and Hypotheses
The research questions of this study are as follows: (1) Which personal, household, socioeconomic, mobility-related, and built environment factors determine the residential location choices in Tehran, Istanbul, and Cairo? (2) Are residential location choice and daily commuting distance correlated? This question can be paraphrased as follows: Is there any significant difference between the daily commuting distance of people who have chosen their house location based on their mobility versus those who have done it based on other factors? (3) Are the time of the residential location choice and the time of the last relocation associated? This question can be reworded into the following: Are residential self-selections of people, who have chosen their residential location based on mobility and other factors, varying based on the time they relocated to their current home?
As the theoretical basis of the study, it is hypothesized that a wide range of factors including personal, household, socioeconomic, mobility-related, and built environment factors determine residential location choices in Tehran, Istanbul, and Cairo. These variables have deep origins in cultural and social issues and are strongly connected to the built environment. Consequently, some of the residential location choices are different from those of high-income and Western countries. In Tehran, Istanbul, and Cairo, the commuting distances and transport-based choices of house location are correlated. In other words, there is a significant difference between the daily commuting distance of people who have chosen their house location based on their mobility versus those who have done it based on other factors. Moreover, residential self-selection of people and the time of their last relocation are associated, meaning that the residential relocation motives (choosing house location based on mobility and other factors) vary according to the times passed after their relocation. These hypotheses are tested in the current paper.
2.2. Data and Variables
This empirical study is based on a mobility survey conducted in 2017 in 18 neighborhoods of Cairo, Istanbul, and Tehran by conducting 8284 face-to-face interviews (Cairo: 2786, Istanbul: 2781, Tehran: 2717). In each neighborhood, between 436 and 476 adults were interviewed. In each case city, two of the neighborhoods were located in the compact areas of the central parts, two were in areas on the periphery or in sprawled areas, and two were located in places with combined characteristics. The case study areas were selected with diversity of different urban forms and locations in mind. The compact neighborhoods were located in the vicinity of the central parts and the historical cores of the cities. These neighborhoods are often compact (but may not be so dense) and their street networks are not completely geometric. The second type of urban form included neighborhoods that originated between the early years of the twentieth century to around 1980. These areas are a combination of compact, old districts and semi-complete gridiron shapes. Finally, the third group of urban forms were newer districts built after 1980, which show characteristics of new quarters with complete grid street networks suitable for car use. The selected neighborhoods formed a good distribution of forms and dates and eras of construction/planning. Another criterion for choosing the neighborhoods was their size, i.e., area and population. An attempt to keep the size of the neighborhoods close to one another was made, so very large or very small neighborhoods were eliminated from the candidates.
The questionnaire consisted of 31 questions in six sections: socioeconomics and household profiles, commute and non-commute travel habits, perceptions about the urban environment, walking and biking infrastructures, and causes of mode choices. After production of land use variables by GIS as well as cleaning of the data, the number of developed variables reached 49, including 29 socioeconomic, perception, and mobility variables and 16 land use variables. In the resulting dataset, the neighborhood-level precisions were 4.5% to 4.7% for individual variables and 1.8% to 2.4% for household variables. The sub-samples of the study were representative at the neighborhood level. The neighborhood sub-samples covered between 0.39% and 7.84% of the neighborhood population, when estimating the percentages based on the individuals. Considering that some of the questions targeted the household (such as monthly household income or household car ownership), the respondent to neighborhood residents’ ratio would be between 1.37% and 33.71%. The details of the survey have already been published in another paper [27].
For the present study on residential location choice, 24 of the 49 variables mentioned above (13 categorical and 11 continuous) were used as independent variables, and residential self-selection was taken as the dependent variable. The table in Appendix A reflects the quantification methods applied to the variables used in the model and statistical analyses of this paper. The respondents were asked about their residential self-selection with the following question: “Why did you choose this neighborhood to live in?” They were asked to provide the one dominant motive behind the selection of the place of their house by themselves or by their family members. They were given the following eight options: (a) the house was affordable to buy or rent, (b) the house was near to my workplace/school, (c) the surrounding environment is attractive, (d) the house will have higher price in the future, (e) to be near to our relatives and/or friends, (f) I live here since I was born/my childhood, (g) the house was easy for me to commute to my workplace/school, and finally, (h) public transportation is available around the neighborhood.
There are two other important variables examined in this paper, namely the time passed since the last relocation and the daily commuting distance. For the former, the respondents were asked to provide one number in response to the question “how many years ago did your household move to the current home?” In order to quantify the commuting distance, interviewees were asked about the nearest location, landmark, or intersection to their home and to their work/study place. The question was designed in this way so as not to violate their privacy. The interviewers then marked the living and working places on maps, and these points were transferred to ArcGIS after first being pinpointed on Google Earth. One-way daily commuting distances were estimated in ArcGIS based on the street networks of the three cities. The commute distances and the land use variables were all quantified by the study team using ArcGIS, the land use variables within a 600 m catchment area around respondents’ homes. These data were connected to the data extracted from the questionnaire for each subject. Thus, a unique dataset was generated combining data from the survey instrument and the land use quantification.
Figure 1 shows how the generation of land use variables by GIS was integrated into the data collection and other methodological sections of the work, such as background studies, questionnaire design, generation of an overall dataset, and statistical analysis.
2.3. Analysis Methods
Out of 8284 interviewees, 4779 respondents answered the question about residential self-selection, and this number is therefore the sample size used for modeling the determinants of location choices (research question 1). In order to answer the first research question, the residential location choice variable, which originally had eight options, was transformed into a dummy (binary) variable with two categories: other factors (coded 0) and mobility factors (coded 1). Three of the eight original options were treated as mobility factors: (b) the house was near to my workplace/school, (g) the house was easy for me to commute to my workplace/school, and (h) public transportation is available around the neighborhood. The rest of the options were considered non-mobility factors. Probit regression modeling was applied with this binary variable as the dependent variable. The variables listed in Appendix A were taken as independent variables.
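The recoding described above can be sketched in a few lines of Python. This is only an illustration: the option letters follow the questionnaire wording quoted in the text, while the function names and the use of None for an unanswered item are assumptions made for this sketch, not part of the original data processing.

```python
# Letters (b), (g) and (h) are the mobility-related motives named in the text.
MOBILITY_OPTIONS = {"b", "g", "h"}
ALL_OPTIONS = set("abcdefgh")


def recode_motive(option):
    """Return 1 for a mobility motive, 0 for any other motive.

    None (hypothetical marker for an unanswered question) is passed through.
    """
    if option is None:
        return None
    letter = option.strip().lower()
    if letter not in ALL_OPTIONS:
        raise ValueError(f"unknown option: {option!r}")
    return 1 if letter in MOBILITY_OPTIONS else 0


def build_dependent_variable(answers):
    """Drop unanswered items and recode the remaining answers to 0/1."""
    return [recode_motive(a) for a in answers if a is not None]
```

Dropping unanswered items before recoding mirrors the reduction from all interviewees to the subset who answered the residential self-selection question.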
The mobility factors were set as a reference category, so the variables and their categories (if any) were compared with reference to this choice. The model was rerun iteratively, and at each iteration the variables with the highest p-values were eliminated. After 16 iterations, eight categorical and continuous variables were kept in the model, and it was considered the best possible model. The following variables were eliminated from the model: intersection density, link-node ratio, cycling, household income, gender, shopping-entertainment place, individual driving license, household car ownership, availability of attractive shopping centers, entertainment place, sense of belonging, frequency of commute trips, activity, subjective security of public transportation use, and shopping-entertainment mode choice inside the neighborhood. The final variables were shopping-entertainment mode choice outside the neighborhood, frequency of public transit trips, neighborhood attractiveness perception, age, number of driving licenses in household, commuting distance, number of accessed facilities, and accessibility of facilities.
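The elimination procedure can be illustrated with a generic backward-elimination loop. The sketch below is a minimal version under stated assumptions: `fit_pvalues` is a hypothetical callable standing in for refitting the probit model and returning one p-value per variable, one variable is dropped per iteration, and the stopping threshold `alpha = 0.05` is assumed here because the text does not state the rule that ended the iterations.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.05):
    """Refit repeatedly, dropping the least significant variable each time.

    variables   -- list of candidate variable names
    fit_pvalues -- hypothetical callable: list of names -> {name: p-value}
    alpha       -- assumed stopping threshold (not given in the text)
    """
    kept = list(variables)
    dropped = []
    while kept:
        pvalues = fit_pvalues(kept)
        # Variable with the highest p-value in the current fit.
        worst = max(kept, key=lambda name: pvalues[name])
        if pvalues[worst] < alpha:
            break  # every remaining variable is significant
        kept.remove(worst)
        dropped.append(worst)
    return kept, dropped
```

In practice, `fit_pvalues` would wrap a statistical package's probit routine; refitting after each removal matters because p-values of the remaining variables change when a correlated variable leaves the model.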
Table 1 summarizes the frequencies of the categorical variables of the model and Table 2 shows the descriptive statistics of the continuous variables.
The second and third research questions of this paper seek associations between a dummy variable (residential self-selection) and two continuous variables (commuting distance and the time of the last relocation) separately. For testing the hypothesis of existence of difference in commuting distance and the time passing from the last relocation according to the residential self-selections, the Mann–Whitney U-test was applied, where p-values of less than 0.05 rejected the hypothesis of existence of similarity between the commuting distance and relocation time of those who selected their house location based on mobility and other factors. This nonparametric test of difference was applied because the two continuous variables were non-normal.
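For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a two-sided Mann–Whitney U-test. It assumes the large-sample normal approximation with tie correction and no continuity correction, which is a simplification chosen for this illustration; the statistical software used in the study may compute the p-value differently.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    values = [v for v, _ in pooled]

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all observations identical
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return u_x, p_value
```

Here `x` could hold the commuting distances of respondents coded 1 (mobility factors) and `y` those coded 0 (other factors); a p-value below 0.05 would then indicate a significant difference between the two groups.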
In order to check for correlations, it was necessary to break the continuous variables into two groups, one with lower values and one with higher values. In other words, it was necessary to find the cutoff point at which a value change occurs, i.e., a point at which the mean commuting distances are significantly different for the two groups of residential location choices, or the point at which the time passed since the last relocation is significantly different for the two groups. To define these two cutoff points for daily commuting distance and the time of the last relocation, Receiver Operating Characteristic (ROC) curves were estimated for the two continuous variables. The ROC curves modeled the diagnostic ability of residential location choice as its discrimination threshold is varied. These curves summarize four combined conditions (true positive (TP), true negative (TN), false positive (FP), false negative (FN)), which are based on two main conditions: positive (P: the number of real positive cases in the data) and negative (N: the number of real negative cases in the data). These conditions are estimated for all values of the continuous variables versus the binary variable on a two-axis diagram, with the true positive rate (sensitivity) on the vertical axis and the false positive rate (1-specificity) on the horizontal axis. The ROC curve provides the best prediction capability when the area under the curve (AUC) is as high as 100% of the diagram area; in other words, when the sensitivity of the continuous variable reaches 1. This happens only under perfect conditions, which do not occur in real-world applications; normally, AUC values of about 90%, 80%, and 70% are taken to indicate excellent, good, and average models, respectively. However, in the case of the ROC curves of this study, the highest prediction power of the model was not sought; instead, the aim was to find the cutoff point.
Theoretically, the cutoff point is the point on the curve with the shortest distance to the top-left corner of the diagram (sensitivity = 1, 1-specificity = 0). This point can be found using the outputs of the SPSS software, in which the sensitivity and 1-specificity values are provided for each point on the curve. The shortest distance was calculated using Formula (1).
The Youden Index helps find the exact value of the cutoff point by means of Formula (2):
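Under the standard definitions, which are assumed here to correspond to Formulas (1) and (2), the distance $d$ of a point on the ROC curve to the top-left corner and the Youden Index $J$ are:

```latex
\begin{align}
d &= \sqrt{(1 - \text{sensitivity})^{2} + (1 - \text{specificity})^{2}} \tag{1} \\
J &= \text{sensitivity} + \text{specificity} - 1 \tag{2}
\end{align}
```

The candidate cutoff is the threshold that minimizes $d$ or, equivalently in many practical cases, maximizes $J$.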
The Youden Index is useful for finding the exact value of the point that has the highest sensitivity and specificity at the same time, which will be the cutoff point. The theoretical range of the Youden Index is from −1 to 1, but the practical range in use is from 0 to 1, since negative values of the Youden Index do not have a physical meaning [28]. Therefore, negative values of the Youden Index were omitted here, and the corresponding values of the two continuous variables were identified.
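The cutoff search can be sketched in plain Python as follows. This is an illustration rather than the SPSS procedure used in the study: the function names are hypothetical, every observed value is tried as a threshold, and the `lower_is_positive` flag is an assumption that lets smaller values (for example, shorter commuting distances) count as predicting the positive class.

```python
import math


def roc_points(values, labels, lower_is_positive=False):
    """Return (threshold, sensitivity, specificity) for each observed value.

    labels are 0/1; both classes must be present in the data.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(values)):
        tp = fp = 0
        for v, y in zip(values, labels):
            predicted = (v <= threshold) if lower_is_positive else (v >= threshold)
            if predicted and y == 1:
                tp += 1
            elif predicted and y == 0:
                fp += 1
        sensitivity = tp / positives
        specificity = 1 - fp / negatives
        points.append((threshold, sensitivity, specificity))
    return points


def closest_to_corner(points):
    """Point with the shortest distance to (1-specificity = 0, sensitivity = 1)."""
    return min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))


def max_youden(points):
    """Point with the largest Youden Index J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)
```

Both criteria return a (threshold, sensitivity, specificity) triple; when several thresholds tie, the smallest one is returned, which is a design choice of this sketch.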
The second step in finding the difference in commuting distance and in the time passed since the last relocation between residential location choices was to recode the two continuous variables into two dummy variables using the cutoff points estimated by the ROC curves. The cutoff points were used to split these two variables at their turning points, which made it possible to test for a significant difference. The results of the Kolmogorov–Smirnov test show that the two continuous variables are non-normal (p < 0.001); thus, to find the differences in the values of the two continuous variables between the residential location choices (other factors and mobility factors), the Kruskal–Wallis test was applied, where p-values of less than 0.05 indicated a significant change. Where a significant change between high and low values of a continuous variable was found, a significant association between residential location choice and that continuous variable was concluded.
4. Discussion
The results of the binary probit model of this study find age to be important in defining residential self-selection. Some of the findings of the model contradict the findings in high-income countries. For instance, the current study did not find any significant relation between location self-selections and car ownership, while in the Netherlands, this relation has been found to be important [29]. In MENA cities, young people are more likely to choose their home location based on mobility needs, while older urban dwellers prioritize other factors. It has been found that in Nanjing, China, the influences of residential self-selection on travel behavior differ between the elderly (60+ years old) and younger respondents (18–59 years old) [30]. This is in line with the findings of this study. China is considered to be an emerging market or, in some definitions, a developing country. In this specific case, the behaviors in the MENA region are similar to those in China. Moreover, in several Western studies, income has been found to be relevant in location decisions [31,32,33,34], while in this study, household monthly income was omitted from the model as it was not significant.
A very important issue that the current study raises is the importance of accessibility to local amenities, including the number of such facilities around homes and the walking distances to them. In the model developed in this study, these variables are highly significant. This finding is consistent with the conclusions of a recent study conducted by Baraklianos et al. (2020), who found accessibility factors to be of importance in their residential location choice model in the Lyon metropolitan area in France [35]. The relations between residential location choice and accessibility of work place and different types of services have also been shown in the Stockholm region, Sweden [36], though in this model, Eliasson did not categorize the location choices into mobility and non-mobility. The study also confirms the findings of Lee et al. (2010), which showed relations between location choices and cumulative opportunities for shopping in the Puget Sound region, USA [37], as well as the findings of Guo (2004) about the importance of accessibility of shopping opportunities in the Dallas/Fort Worth metroplex, USA [38], and finally the results of Zhang and Guhathakurta (2018), who found a higher number of amenities to be important in Atlanta, USA [39]. However, at the same time, the results contradict the findings of some studies that did not recognize accessibility as a key factor in house location choices in Chicago, USA [40], Melbourne, Australia [41], and Santander, Spain [42].
The findings of the current study regarding strong correlations of commuting distance with residential location choices are in line with the findings of Blijie in the Dutch context [29]. The present study shows that the current commuting distance is related to the last relocation choice. This can also be compared to the findings of Chen et al. (2008), who found this relation between the prior commute distance and the last relocation in the Puget Sound region, USA [43], as well as the work of Cockx and Canters (2020), which showed the effect of job accessibility on residential self-selection in Belgium [44]. In general, the MENA findings show that the trade-off between mobility motives and other factors is becoming more serious in the large cities of the region, as has already been shown in the Western context, where the subjective value of time as a component of commuting is weighed against the household’s willingness to pay higher rent to reduce commuting time [45].
According to Fatmi and Habib (2017), individuals prefer to persist with their past commute mode [46]. In this regard, the study on the MENA region shows that longer commuting distances in large cities and agglomerations might change attitudes towards changing house location. Theoretically, this change can occur once the daily commuting distance exceeds a threshold of 8596 m. Long commuting distances through the interwoven, heavily congested streets of such cities might encourage younger generations to live in the vicinity of their work/study places. The results of the statistical model of this paper as well as the hypothesis testing confirm this. Individuals who moved house less than 15.5 years ago are more likely to have chosen their location based on mobility than those who moved earlier. This reflects a change in the lifestyles of people living in large cities. These findings are generalizable to up to 27 cities, each accommodating at least one million people, according to a recent study [47]. The change in the attitudes of younger generations strengthens the ties between residential self-selection and urban travel behavior in the region. On one hand, it adds to the complexity of the factors influencing travel demand; on the other hand, it can be used by local urban and transportation planners as a basis for policymaking. For instance, the findings of this paper suggest that when employment centers and jobs are available within roughly four kilometers of residents’ homes, people consider this an acceptable commuting distance. It has been shown in this paper that people who have chosen their living location based on mobility preferences live in places as near as 4298 m to their working place.
This finding is related to the question asked by De Vos and Witlox: “Do people live in urban neighborhoods because they do not like to travel?” [48]. The response of this study is yes, but it might be related to the attitudes and perceptions of people. If commuting is so important to them that they choose their house location based on it, then they are likely to choose a location of less than 4.3 km distance to the workplace of one of the most frequently commuting household members. This assumption, which is largely suggested by this study, can be adopted for urban planning policymaking by providing employment clusters in less dense areas of cities or metropolitan regions that are intended to attract more residents in the future, with the purpose of defining the development orientation of the city. Of course, this paper can only suggest this strategy for the larger cities of the MENA region, since it is backed only by empirical findings from this region.
Another input of this study to urban planning and housing policy is that cheaper, social, or affordable housing can be targeted by urban development plans on the periphery of the large cities, provided that employment has already been planned for. This study shows that younger generations may give more importance to commuting in choosing their house location compared to previous generations. Thus, urban policymakers can attract them to new quarters only if there are working opportunities nearby. This is linked to studies that suggest providing jobs with the purpose of turning urban sprawl from a problem into an opportunity, i.e., making commuting trips shorter [49,50]. Of course, this strategy is not suitable for medium-sized or small cities in the region, as one cannot assume without empirical results that residents would choose commuting necessities over socioeconomic motives.
These planning opportunities can improve the reciprocal functions of land use and transport behavior. On one hand, urban growth can be controlled, while on the other hand, travel behavior can be directed towards more sustainable and perhaps more active modes. Previous urban planning studies have shown that in Egypt and Iran, a lack, deficiency, or absence of urban development plans has led to urban sprawl [51]. Utilizing land use–transportation integrated planning can ease some of the long-lasting urban travel problems of cities like Tehran, Istanbul, and Cairo, such as traffic congestion, long commutes, and transport-related environmental pollution, as well as urban sprawl and its negative social and financial impacts. From this perspective, this study is in line with the approach of Western solutions of the past four decades.
5. Conclusions
The present study sheds light on the determinants of residential location choice in the less-studied context of the MENA region and at the same time identifies some contextual differences from these determinants in high-income and mostly Western countries. Eight variables have been recognized to influence residential self-selections: shopping-entertainment mode choice outside the neighborhood (in faraway places), frequency of public transit trips, neighborhood attractiveness perception, age, number of driving licenses in household, commuting distance, number of accessed facilities, and the (walkable) accessibility to facilities. Moreover, it has been found that people who have chosen their current home based on mobility commute a daily mean distance of 8596 m, while those who chose their home based on other reasons such as socioeconomics or personal reasons commute longer. Finally, people whose location choice is based on transportation reasons are likely to have moved to their new home less than 15.5 years ago, while those who moved before that date likely had other reasons. This shows how the attitudes of people towards residential location have changed in the MENA region. Such findings are of importance from a basic scientific point of view; at the same time, decision makers and urban planners can use them for the purpose of making commuting more sustainable.
Although the sample size of this study is enough for providing the necessary power for the analyses, this study has its own limitations, e.g., the role of the personal life events in choosing house locations was not investigated by this study. Life-course events such as completing school or university studies, marriage, starting a new job, having a child, etc. may significantly affect the location choices. These correlations were not investigated here, because this study was primarily designed not only to examine the housing preferences, but also to investigate the travel behaviors in the three cities. In the future, in studies that are fundamentally designed for the purpose of studying housing and relocation tendencies, the investigation of the relations between residential self-selection and “mobility biographies” in the MENA region will be intriguing.