<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Bohr. Scit.</journal-id>
<journal-title>BOHR International Journal of Smart Computing and Information Technology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Bohr. Scit.</abbrev-journal-title>
<issn pub-type="epub">2583-2026</issn>
<publisher>
<publisher-name>BOHR</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.54646/bijscit.2022.23</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Review</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Analysis of the influence factors of global film box office based on a log-linear model</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Xie</surname> <given-names>Xinyu</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
</contrib>
</contrib-group>
<aff><institution>School of Management, Tianjin University of Technology</institution>, <addr-line>Tianjin</addr-line>, <country>China</country></aff>
<author-notes>
<corresp id="c001">&#x002A;Correspondence: Xinyu Xie, <email>xiexinyu666@foxmail.com</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>01</day>
<month>04</month>
<year>2022</year>
</pub-date>
<volume>3</volume>
<issue>1</issue>
<fpage>17</fpage>
<lpage>24</lpage>
<history>
<date date-type="received">
<day>24</day>
<month>02</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>18</day>
<month>03</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Xie.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Xie</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>In this paper, Python is used to obtain data on the top 100 films at the global box office, and data visualization suggests that the global box office is likely to peak again within the next 10 years and that science fiction and action movies are more popular. A log-linear model and a random unit group design model were then constructed in R. The results of the random unit group design model showed that science fiction and action movies were more favorable to the box office. k-means clustering analysis indicated that direction by well-known directors had a more positive effect on the box office. Based on these conclusions, suggestions are offered to major cinemas and film and television companies.</p>
</abstract>
<kwd-group>
<kwd>R language</kwd>
<kwd>log-linear model</kwd>
<kwd>random unit group design model</kwd>
<kwd>k-means clustering</kwd>
</kwd-group>
<counts>
<fig-count count="5"/>
<table-count count="2"/>
<equation-count count="9"/>
<ref-count count="30"/>
<page-count count="8"/>
<word-count count="4704"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>1. Introduction</title>
<p>As the global economy continues to grow, the potential value of the film market continues to increase and the film industry is growing rapidly. However, this growth has also given rise to many problems in the industry. More and more scholars have turned their attention to the film industry, and a growing number of quantitative studies have examined the factors influencing movie box office. In the domestic market, China&#x2019;s box office has emerged as a global player only since 2002, when the movie theater system was reformed and the movie market was activated, but it is still at an extreme disadvantage. Since 2002, Chinese movies have embarked on an industrialized path of development, with the total annual box office growing from around 1 billion RMB to 63 billion RMB in 2019. However, from 1982 to 2016, China&#x2019;s box office remained at the bottom of the global ranking. It was not until 2017 that a domestic movie, &#x201C;Wolf Warrior 2,&#x201D; took a place in the global box office ranking, and it still lags far behind European and American movies. This paper examines data on the top 100 films at the global box office from 1982 to 2022. Through quantitative analysis of different kinds of movies, we determine which kinds are more popular with the public, and the conclusions can encourage the domestic film industry to produce films of those types. The results also provide a reference for the types of movies that domestic cinemas release and a basis for scheduling screenings. Finally, the paper visualizes box office data over time, analyzes the trend, and predicts future changes, providing a basis for decision-making in the movie industry.</p>
</sec>
<sec id="S2">
<title>2. Literature review</title>
<p>Since the 1960s, there has been a growing body of research on the factors influencing movie box office.</p>
<p>Liao Lin et al. studied the influence of social media marketing channels and events on movie box office through the likelihood model and concluded that social media can improve customers&#x2019; purchase intention through relevant routes (<xref ref-type="bibr" rid="B1">1</xref>). Lee Young-Jin et al. used box office data on American films and a two-stage instrumental variable method with fixed effects to account for the heterogeneity of films and the simultaneous relationship between user reviews, advertising expenditure, and sales, and evaluated the impact of advertising expenditure on sales after a film&#x2019;s release (<xref ref-type="bibr" rid="B2">2</xref>). Wei Lu determined the main index system affecting film box office in China by combining a questionnaire survey with expert interviews and provided a valuable reference for future risk control and film investment decisions through a neural network prediction model (<xref ref-type="bibr" rid="B3">3</xref>). Ya-Han Hu et al. used sentiment analysis tools to quantify film reviews. They combined the quantified data with basic film information and external environmental factors and predicted box office performance with the model tree (M5P) model, a linear regression model, and a support vector regression model (<xref ref-type="bibr" rid="B4">4</xref>). Oh Chong et al. used the ordinary least squares (OLS) regression model and found that consumer participation behavior on Facebook and YouTube was positively correlated with total box office revenue. However, the same effect was not observed on Twitter. The results show the importance of investing in social media communication across multiple channels (<xref ref-type="bibr" rid="B5">5</xref>). Yong Lou et al. used actual word-of-mouth information to study the dynamic patterns of word of mouth and how it helps explain box office receipts. 
The results show that word-of-mouth information has important explanatory power for both gross and weekly box office receipts, especially in the first few weeks after a film&#x2019;s release. However, this explanatory power comes mainly from the volume of word of mouth, not its valence, which is measured by the percentage of positive and negative information (<xref ref-type="bibr" rid="B6">6</xref>). Tingting Song et al. built a prediction model of movie box office revenue by studying the complex relationship between box office revenue and user-generated content on the Weibo platform (MUGC), marketer-generated content (MGC), and user-generated content (UGC) on third-party platforms. The study found that the volume of corporate microblogging (MGC) not only predicts box office revenue directly but also predicts it indirectly through MUGC. MUGC therefore partially mediates the relationship between the volume of corporate microblogging and box office revenue (<xref ref-type="bibr" rid="B7">7</xref>).</p>
<p>Elliot Mbunge et al. applied the PRISMA model to review papers published from 2010 to 2019 and extracted from Google Scholar, Science Direct, IEEE Xplore Digital Library, ACM Digital Library, and Springer Link. The study shows that the support vector machine was the most frequently used method for predicting box office success, accounting for 21.74% of the total, followed by linear regression at 17.39%. This study also provides some valuable references for this paper (<xref ref-type="bibr" rid="B8">8</xref>). U. Ahmed et al. predicted the box office before film production based on information such as actors&#x2019; experience, journalists&#x2019; comments, media reports, user ratings, and the income generated by associated films (<xref ref-type="bibr" rid="B9">9</xref>). L. Kang et al. analyzed Chinese film data from 2009 to 2018 and examined the role of film quality signals (such as star power, Internet media reviews, and industry recognition) in an empirical analysis of the relationship between family-friendly content and box office receipts. The results show that explicit sex and profanity in films have a negative and statistically significant effect on box office takings, confirming the role of cultural values in the economic success of Chinese films. However, in big-budget films involving superstars, graphic violence attracts the audience, so box office revenue increases (<xref ref-type="bibr" rid="B10">10</xref>). Another study, based on the Chinese film market, considered the factors influencing film box office from multiple dimensions, combined a questionnaire survey with expert interviews to determine the main index system affecting movie box office (MBO), and then established an MBO prediction model with the neural network BRP method to predict box office (<xref ref-type="bibr" rid="B11">11</xref>). B. S. Wibowo et al. 
studied the key factors in the box office failure and success of Indonesian local films, and the results showed that the success of local films was mainly driven by the popularity of actors and the presence of foreign films at the box office (<xref ref-type="bibr" rid="B12">12</xref>). L. A. B. S. Ghazal et al. studied data on 361 movies, and the results showed that the two factors affecting box office success were the number of days a movie was in release and the number of theaters. In addition, the proximity of film release dates to seasonal holidays in Malaysia consistently showed a positive correlation (<xref ref-type="bibr" rid="B3">3</xref>). L. I. Pei-zhi et al. established a hybrid prediction model based on web search data. First, the optimal training set (OTS) was constructed by matching the training data most similar to the test set. Second, the imperialist competitive algorithm (ICA) was used to select the best parameter combination for the least squares support vector machine (LSSVM). Finally, the optimized model was used for prediction (<xref ref-type="bibr" rid="B13">13</xref>).</p>
<p>Although many domestic and foreign scholars have studied the factors influencing movie box office, few have conducted multivariate statistical analyses of the top 100 global box office data from 1982 to 2022. Studies that use movie categories as influencing factors are even rarer. In addition, this paper uses cluster analysis to group films by box office and category and then draws conclusions about the relationships between them.</p>
</sec>
<sec id="S3">
<title>3. Data description and processing</title>
<sec id="S3.SS1">
<title>3.1. Data mining</title>
<p>Using Python&#x2019;s requests library to crawl the <ext-link ext-link-type="uri" xlink:href="http://www.piaofang.biz/">http://www.piaofang.biz/</ext-link> website, we obtained the top 100 global movie box office data from 1982 to 2022, which include fields such as &#x201C;movie name,&#x201D; &#x201C;release date,&#x201D; &#x201C;movie type,&#x201D; &#x201C;movie box office,&#x201D; and &#x201C;director name.&#x201D;</p>
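<p>The crawling step can be illustrated with a minimal sketch. It assumes that the ranking is published as an HTML table; the class <italic>TableRowParser</italic> and the functions <italic>parse_rows</italic> and <italic>fetch_rows</italic> are hypothetical names introduced for illustration, and the actual page layout of the site is not described in this paper.</p>
<preformat><![CDATA[
```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_rows(html):
    """Return the table rows found in an HTML string."""
    parser = TableRowParser()
    parser.feed(html)
    return parser.rows


def fetch_rows(url, timeout=10):
    """Download a page with requests and parse its table rows."""
    import requests  # third-party; imported here so parse_rows works offline

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    response.encoding = response.apparent_encoding  # guess the page encoding
    return parse_rows(response.text)
```
]]></preformat>
<p>Under these assumptions, calling <italic>fetch_rows</italic> on the site address would return one list of strings per table row, which can then be mapped to the fields listed above.</p>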
</sec>
<sec id="S3.SS2">
<title>3.2. Numeric coding of the data</title>
<p>The movie categories in the resulting data were coded numerically; the coding is shown in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Numeric coding of the movie categories.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Category name</td>
<td valign="top" align="center">Numeric code</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Science fiction</td>
<td valign="top" align="center">1</td>
</tr>
<tr>
<td valign="top" align="left">Disaster</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">Action</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">Romance</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">Adventure</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">Animation</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left">Fantasy</td>
<td valign="top" align="center">7</td>
</tr>
<tr>
<td valign="top" align="left">Comedy</td>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">Gun battle</td>
<td valign="top" align="center">9</td>
</tr>
<tr>
<td valign="top" align="left">Thriller</td>
<td valign="top" align="center">10</td>
</tr>
<tr>
<td valign="top" align="left">Crime</td>
<td valign="top" align="center">11</td>
</tr>
<tr>
<td valign="top" align="left">War</td>
<td valign="top" align="center">12</td>
</tr>
<tr>
<td valign="top" align="left">Music</td>
<td valign="top" align="center">13</td>
</tr>
<tr>
<td valign="top" align="left">Biography</td>
<td valign="top" align="center">14</td>
</tr>
<tr>
<td valign="top" align="left">Military</td>
<td valign="top" align="center">15</td>
</tr>
<tr>
<td valign="top" align="left">Intimacy</td>
<td valign="top" align="center">16</td>
</tr>
<tr>
<td valign="top" align="left">Family</td>
<td valign="top" align="center">17</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S3.SS3">
<title>3.3. Data cleaning and missing value processing</title>
<p>After screening, Python was used to clean the data by removing noisy and duplicate records. The category of the movie &#x201C;Wolf Warrior 2&#x201D; was found to be missing, so a random forest model was used to fill in the missing value. The variables of the model are listed in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Table of variables of the random forest model.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Model variables</td>
<td valign="top" align="left">Variable meaning</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">P</td>
<td valign="top" align="left">Probability of each class</td>
</tr>
<tr>
<td valign="top" align="left">H</td>
<td valign="top" align="left">Current leaf node entropy value</td>
</tr>
<tr>
<td valign="top" align="left">ID</td>
<td valign="top" align="left">GINI factor</td>
</tr>
<tr>
<td valign="top" align="left">n</td>
<td valign="top" align="left">Total number of data points</td>
</tr>
<tr>
<td valign="top" align="left">C</td>
<td valign="top" align="left">Value of the evaluation function</td>
</tr>
<tr>
<td valign="top" align="left">V</td>
<td valign="top" align="left">Current node data volume</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S3.SS4">
<title>3.3.1. Random forest model</title>
<p>The random forest algorithm builds decision trees with movie category 2 as the label and the release time, category 1, and the director&#x2019;s name as the features. The information entropy and information gain rate of each node are calculated to determine the root node and child nodes of each tree. Multiple decision trees are built to form a random forest; each tree is trained on a randomly selected 60% of the data in Form 1, and the forest then predicts the missing value of category 2. The quality of the prediction is assessed by calculating the evaluation function for each decision tree, and the imputed value of category 2 is 15 (i.e., military).</p>
<p>The random forest computational model is as follows.</p>
<disp-formula id="S3.Ex1"><mml:math id="M1">
<mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mtext> </mml:mtext>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S3.Ex2"><mml:math id="M2">
<mml:mrow>
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>lg</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Evaluation function:</p>
<disp-formula id="S3.Ex3"><mml:math id="M3">
<mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>leaf</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>H</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
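<p>As a worked illustration of the entropy and the evaluation function above, the following minimal sketch computes both for small lists of class labels. It assumes that <italic>V</italic><sub><italic>t</italic></sub> is the number of samples in leaf <italic>t</italic>; the function names <italic>entropy</italic> and <italic>tree_cost</italic> and the example labels are illustrative and are not taken from the data of this paper.</p>
<preformat><![CDATA[
```python
import math
from collections import Counter


def entropy(labels):
    """Entropy H = -sum(P_i * ln(P_i)) over the classes in `labels`."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def tree_cost(leaves):
    """Evaluation function C(T) = sum over leaves of V_t * H(t).

    `leaves` is a list of label lists, one per leaf node; V_t is assumed
    to be the number of samples that fall into leaf t.
    """
    return sum(len(leaf) * entropy(leaf) for leaf in leaves)


# Illustrative labels only: category codes 15 (military) and 3 (action).
mixed_leaf = [15, 15, 3, 3]   # two classes, equally likely
pure_leaf = [15, 15]          # a single class
print(entropy(mixed_leaf))    # equals ln(2) for two equally likely classes
print(tree_cost([pure_leaf, [3, 15]]))
```
]]></preformat>
<p>A leaf that contains a single class has zero entropy, so a tree whose leaves are purer has a smaller value of <italic>C</italic>(<italic>T</italic>).</p>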
</sec>
<sec id="S3.SS5">
<title>3.4. Preliminary analysis of data visualization</title>
<p>A time series plot of these data with the time variable as the horizontal axis and the box office as the vertical axis is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Time series chart.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="bijscit-2022-23-g001.tif"/>
</fig>
<p>As shown in the figure, the movie box office showed an increasing trend from 1982 to around 1998 and reached its peak in 2009; up to 2022, the highest box office value still had not exceeded that of 2009. Further examination shows that the 2009 peak is attributable to the movie &#x201C;Avatar,&#x201D; which had excellent special effects, an appealing plot, and an experienced director, James Cameron. Moreover, several of James Cameron&#x2019;s films are listed in the top 100 films at the global box office. In addition, the series fluctuates up and down, with peaks occurring every 10 years or so. The graph suggests that another peak is likely to occur in the next few years, but whether its box office will surpass that of &#x201C;Avatar&#x201D; still needs further research and analysis.</p>
<p>The genre labels of each film are separated and stacked into a single column, and the box office is then averaged for each genre. A pie chart of the average box office of the different genres is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Average box office of different movie types.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="bijscit-2022-23-g002.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F2">Figure 2</xref>, romance movies have the highest average box office, but further examination shows that this category contains few movies, which inflates the average; it does not mean that all romance movies have a high box office. Disaster movies have the second highest average box office, while the average box office share of family movies and military movies is lower.</p>
<p>Therefore, film companies should be cautious about producing romance movies, even though their average box office is high in <xref ref-type="fig" rid="F2">Figure 2</xref>. The high average rests on a small number of films and does not necessarily indicate profitability for a company. In addition, from the perspective of profitability, companies should produce fewer military, family, and affection movies, because the box office revenue of such movies is generally not high, and more action, science fiction, comedy, and disaster movies, because these films account for the majority of the data and their average box office is still in the middle to upper range.</p>
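<p>The averaging by genre, and the reason a genre with few films can show a misleadingly high average, can be sketched as follows. The function name <italic>genre_summary</italic> and the three example records are hypothetical and are not taken from the data of this paper.</p>
<preformat><![CDATA[
```python
from collections import defaultdict


def genre_summary(films):
    """Return {genre: (number of films, mean box office)}.

    `films` is a list of (box_office, [genre labels]) pairs; a film with
    two labels contributes its box office to both genres.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for box_office, genres in films:
        for genre in genres:
            totals[genre] += box_office
            counts[genre] += 1
    return {g: (counts[g], totals[g] / counts[g]) for g in totals}


# Hypothetical records (box office in arbitrary units), for illustration only.
films = [
    (10.0, ["Action", "Science fiction"]),
    (20.0, ["Action"]),
    (30.0, ["Romance"]),
]
for genre, (count, mean) in genre_summary(films).items():
    print(genre, count, mean)
```
]]></preformat>
<p>Reporting the number of films next to each mean makes it visible when an average, such as the one for the single hypothetical romance film here, rests on very few observations.</p>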
</sec>
</sec>
<sec id="S4">
<title>4. Model construction</title>
<sec id="S4.SS1">
<title>4.1. Log-linear model</title>
<p>The mathematical model was constructed with box office as the dependent variable and category 1, category 2, time, and director as the independent variables. The independent variables were treated as categorical variables, while the dependent variable could be treated as either a categorical or a continuous variable, so different models were fitted to the data and compared to see which was more suitable.</p>
<p>First, the dependent variable was treated as a multicategory variable, so a log-linear model was used to fit the data. Category 1 and category 2 are two qualitative variables, and the director can be treated as a continuous variable. The model is therefore:</p>
<disp-formula id="S4.Ex4"><mml:math id="M4">
<mml:mrow>
<mml:mrow>
<mml:mi>ln</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>&#x03BB;</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x03BC;</mml:mi>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x03B1;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>&#x03B3;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x03B5;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where &#x03BC; is the constant term, &#x03B1;<sub><italic>i</italic></sub> and &#x03B2;<sub><italic>j</italic></sub> are the main effects of the two qualitative variables category 1 and category 2, <italic>x</italic> is the continuous director variable, &#x03B3; is its coefficient, and &#x03B5;<sub><italic>ij</italic></sub> is the residual term. The logarithm of the positive Poisson parameter &#x03BB; is taken so that the left-hand side of the model can take any value on the real axis.</p>
</sec>
<sec id="S4.SS2">
<title>4.2. Randomized unit group design model</title>
<p>Considering the dependent variable as a continuous variable and category 1 and director as qualitative factors, a randomized unit group design model can be fitted to the data. Let the treatment factor <italic>x</italic><sub><italic>2</italic></sub> have <italic>G</italic> levels and the unit group factor <italic>x</italic><sub><italic>4</italic></sub> have <italic>n</italic> units, which can be viewed as <italic>n</italic> levels. After generating dummy variables for the <italic>G</italic> levels of <italic>x</italic><sub><italic>2</italic></sub> and for the <italic>n</italic> levels of <italic>x</italic><sub><italic>4</italic></sub>, the box office results <italic>y</italic><sub><italic>ij</italic></sub> are expressed as follows:</p>
<disp-formula id="S4.Ex5"><mml:math id="M5">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>&#x03BC;</mml:mi>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x03B1;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x03B2;</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Written in matrix form with the dummy variables, the model becomes:</p>
<disp-formula id="S4.Ex6"><mml:math id="M6">
<mml:mrow>
<mml:mi>Y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>&#x03B2;</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
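<p>The construction of the matrix <italic>X</italic> from dummy variables can be sketched as follows. This is a minimal illustration that assumes the first level of each factor is dropped as the reference level, a common convention that the text does not specify; the function names and the small example are hypothetical.</p>
<preformat><![CDATA[
```python
def dummy_columns(values):
    """Return the sorted levels and, per observation, one 0/1 entry for
    every level except the first (assumed reference) level."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels[1:]] for v in values]
    return levels, rows


def design_matrix(treatment, block):
    """Rows of X: intercept, treatment dummies, then unit group dummies."""
    _, t_rows = dummy_columns(treatment)
    _, b_rows = dummy_columns(block)
    return [[1] + t + b for t, b in zip(t_rows, b_rows)]


# Hypothetical example: two category codes crossed with two directors.
treatment = [1, 1, 3, 3]          # category 1 codes (treatment factor)
block = ["A", "B", "A", "B"]      # director labels (unit group factor)
for row in design_matrix(treatment, block):
    print(row)
```
]]></preformat>
<p>Each row of the result corresponds to one observation <italic>y</italic><sub><italic>ij</italic></sub>, and the coefficient vector can then be estimated by least squares.</p>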
<sec id="S4.SS2.SSS1">
<title>4.3. k-means clustering</title>
<p>The movies, identified by title number, were clustered with the k-means method using category 1, category 2, director, and worldwide box office as indicators. The 100 objects are divided into three classes, and the distances between the objects and the cluster centers are recalculated iteratively until the criterion function converges. The squared error criterion is usually used:</p>
<disp-formula id="S4.Ex7"><mml:math id="M7">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:munder>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>E</italic> is the sum of the squared errors of all objects in the data with respect to their cluster centers, <italic>p</italic> represents a point in the object space belonging to class <italic>c</italic><sub><italic>i</italic></sub>, and <italic>m</italic><sub><italic>i</italic></sub> is the mean value of class <italic>c</italic><sub><italic>i</italic></sub>.</p>
<p>The k-means algorithm for clustering based on the mean value in the class is as follows.</p>
<list list-type="simple">
<list-item>
<label>1.</label>
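<p>Collect the sample set {<italic>x</italic><sub>1</sub>,<italic>x</italic><sub>2</sub>,<italic>x</italic><sub>3</sub>,&#x2026;,<italic>x</italic><sub><italic>n</italic></sub>}, where <italic>n</italic> = 100 is the total number of samples. Each sample vector is <italic>x</italic><sub><italic>j</italic></sub> = {<italic>x</italic><sub><italic>j</italic>1</sub>, <italic>x</italic><sub><italic>j</italic>2</sub>, <italic>x</italic><sub><italic>j</italic>3</sub>, &#x2026;, <italic>x</italic><sub><italic>jm</italic></sub>}, where <italic>x</italic><sub><italic>jt</italic></sub> is the <italic>t</italic>th attribute of the <italic>j</italic>th sample and <italic>m</italic> is the number of attributes. Select <italic>k</italic> = 3 initial cluster centers.</p>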
</list-item>
<list-item>
<label>2.</label>
<p>Repeat steps 3 and 4 below until the clusters no longer change.</p>
</list-item>
<list-item>
<label>3.</label>
<p>Taking the mean value of the objects in each cluster as its central object, calculate the distance of each object from these central objects and reassign each object to the cluster whose central object is nearest.</p>
</list-item>
<list-item>
<label>4.</label>
<p>Recalculate the mean value of each (changed) cluster.</p>
</list-item>
</list>
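<p>The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this paper; the initial centers are passed in explicitly, and the four two-dimensional points in the example are hypothetical.</p>
<preformat><![CDATA[
```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centers, max_iter=100):
    """Alternate assignment (step 3) and mean update (step 4).

    Returns (labels, centers, E), where E is the squared error criterion.
    """
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:  # step 2: stop when clusters no longer change
            break
        labels = new_labels
        # Step 4: recompute the mean of each non-empty cluster.
        for i in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = mean_point(members)
    error = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, error


# Hypothetical two-dimensional points forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers, error = k_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centers, error)
```
]]></preformat>
<p>Because the distance is Euclidean, attributes measured on very different scales, such as category codes and box office, are usually standardized before clustering so that no single attribute dominates.</p>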
</sec>
</sec>
</sec>
<sec id="S5">
<title>5. Model solving and result analysis</title>
<sec id="S5.SS1">
<title>5.1. Log-linear model solving and analysis</title>
<p>The processed experimental data are imported into R, and the log-linear model is then constructed and solved. The results are shown in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Results of the log-linear model solution.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="bijscit-2022-23-g003.tif"/>
</fig>
<p>The solution shows that the regression model can be expressed as follows:</p>
<disp-formula id="S5.Ex8"><mml:math id="M8">
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mn>1.539</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:msup>
<mml:mn>10</mml:mn>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mn>6.350</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:msup>
<mml:mn>10</mml:mn>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mn>1.144</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:msup>
<mml:mn>10</mml:mn>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>4</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mn>21.21</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
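<p>As a rough illustration, the fitted equation above can be evaluated directly. The coefficients and intercept are copied from the equation; the function name <monospace>fitted_y</monospace> and the input codes 0 and 1 are hypothetical, since the numeric coding of <italic>x</italic><sub>2</sub>, <italic>x</italic><sub>3</sub>, and <italic>x</italic><sub>4</sub> is not restated here, and <italic>y</italic> is left on the scale of the fitted model.</p>
<preformat>
```python
# Coefficients copied from the fitted equation in the text.
INTERCEPT = 21.21
COEF = {"x2": -1.539e-2, "x3": -6.350e-3, "x4": -1.144e-2}


def fitted_y(x2, x3, x4):
    """Evaluate y = -1.539e-2*x2 - 6.350e-3*x3 - 1.144e-2*x4 + 21.21."""
    return INTERCEPT + COEF["x2"] * x2 + COEF["x3"] * x3 + COEF["x4"] * x4


# Hypothetical codes: raising x2 by one unit lowers y by 0.01539.
baseline = fitted_y(0, 0, 0)
shifted = fitted_y(1, 0, 0)
```
</preformat>
<p>Because all three coefficients are negative, increasing any of the coded variables lowers the fitted value of <italic>y</italic>.</p>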
<p>As shown in the model results, <italic>p</italic><sub>1</sub> &#x003C; 0.01, indicating that <italic>x</italic><sub><italic>2</italic></sub> (type 1) has a significant impact on box office, and the fitted coefficient shows a significant negative correlation between type 1 and box office. This further shows that science fiction, action, and adventure movies are more popular and earn higher box office. Major cinemas can therefore consider introducing more science fiction and action movies, and major film companies can produce more high-quality science fiction and action movies to increase box office revenue and profit.</p>
<p>Likewise, <italic>p</italic><sub>2</sub> &#x003C; 0.01, indicating that <italic>x</italic><sub><italic>3</italic></sub> (type 2) also has a significant impact on box office, again with a significant negative correlation. This indicates that family films are not very popular with the general public; further analysis shows that few family films appear in the global box office top 100. Family movies are therefore unlikely to make theaters more profitable, and they offer film companies little assurance of profitability. Major cinemas can thus introduce family movies selectively, and major film companies can reduce their output of family movies appropriately to avoid wasting funds.</p>
<p>Similarly, <italic>p</italic><sub>3</sub> &#x003C; 0.01 shows that <italic>x</italic><sub><italic>4</italic></sub> (director) has a significant effect on box office, again with a significant negative correlation. The data illustrate that a movie directed by James Cameron is very likely to perform well at the box office. Major cinemas can therefore introduce more films directed by James Cameron when selecting new films, and major film companies can invite him to direct their productions.</p>
</sec>
<sec id="S5.SS2">
<title>5.2. Stochastic unit group design model solving and analysis</title>
<p>The processed data are imported into R, where the model is constructed and fitted. The results are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Results of solving the random unit group design model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="bijscit-2022-23-g004.tif"/>
</fig>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>K-means clustering results.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="bijscit-2022-23-g005.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F4">Figure 4</xref>, <italic>P<sub>x<sub>2</sub></sub></italic> &#x003C; 0.05, indicating that type 1 has a significant effect on box office, and <italic>P<sub>x<sub>4</sub></sub></italic> &#x003C; 0.05, indicating that the director factor also has a significant effect on box office. This is consistent with the conclusion obtained from the log-linear model. However, these <italic>P</italic> values are less significant than those of the log-linear model. Therefore, it is more appropriate to treat the dependent variable, box office, as a categorical variable.</p>
</sec>
<sec id="S5.SS3">
<title>5.3. K-means clustering results and analysis</title>
<p>The clustering data are imported into R, and category 1, category 2, director, and worldwide box office are used as the clustering indicators for k-means clustering. The data are normalized before clustering:</p>
<disp-formula id="S5.Ex9"><mml:math id="M9">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>min</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>max</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>min</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>x</italic><sub><italic>0</italic></sub> denotes the normalized value, <italic>x</italic> denotes the original value, <italic>x</italic><sub><italic>min</italic></sub> denotes the minimum value, and <italic>x</italic><sub><italic>max</italic></sub> denotes the maximum value. The data were clustered after this normalization, and the clustering results are shown in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
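<p>The min-max normalization above can be sketched in Python as follows. The function name <monospace>min_max_normalize</monospace>, the handling of a constant column, and the sample values are assumptions made for illustration; they are not taken from the study's data or code.</p>
<preformat>
```python
def min_max_normalize(values):
    """Rescale values to [0, 1] with x0 = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumed convention: a constant column is mapped to zeros
        # to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Made-up values, not the paper's data.
box_office = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(box_office)
```
</preformat>
<p>Applying the same rescaling to every clustering indicator places them on a common scale, so that no single indicator, such as worldwide box office, dominates the distance calculation in k-means.</p>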
<p>The top movies in the clustering results fall into the same cluster. The analysis shows that most of the top box office movies are science fiction action movies directed by internationally famous directors, such as James Cameron. Therefore, cinemas should introduce more science fiction action movies and more movies by famous directors, and cooperate with film and television companies to earn higher profits. Film and television companies can also invest more in such movies to improve their quality, capture consumer surplus, and increase revenue. The films ranked 6&#x2013;21 include a high proportion of science fiction and adventure films, so major cinemas can invest appropriately in these films to earn more profit. Romance, family, and military films rank relatively low, and few films from these categories appear in the list; science fiction and action films still make up the largest share. It is therefore difficult for romance, family, and military films to enter the global box office top 100, and harder still to reach the top 20. Major cinemas should consider carefully before introducing such films, and film companies should weigh their investment and production decisions for them comprehensively.</p>
</sec>
</sec>
<sec id="S6">
<title>6. Summary and prospect</title>
<p>This study visualizes the top 100 global movie box office data. A time series chart shows that box office peaks roughly every 10 years and is expected to peak again in the future. Fitting the data with a log-linear model and a randomized unit group design model shows that most high-grossing movies are science fiction and action movies, whereas family and biography movies occupy a smaller proportion of the top 100 box office list. It is therefore suggested that movie companies invest more in science fiction and action movies and less in family movies. The cluster analysis shows that the directors of high-grossing movies are all internationally renowned and that the top-ranking movies at the box office are mostly science fiction action movies. The Chinese film and television industry can therefore consider producing high-quality science fiction action movies to improve its box office performance.</p>
<p>This study still has many shortcomings. A larger data set covering more dimensions could be analyzed. In addition, neural networks and other intelligent optimization algorithms could be introduced to obtain more profound conclusions. The time column of the data could also be fully exploited by building AR, MA, and other time series models to predict future data and obtain more accurate conclusions.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>L</given-names></name> <name><surname>Huang</surname> <given-names>T</given-names></name></person-group>. <article-title>The effect of different social media marketing channels and events on movie box office: an elaboration likelihood model perspective.</article-title> <source><italic>Inf Manag.</italic></source> (<year>2021</year>) <volume>58</volume>:<issue>103481</issue>.</citation></ref>
<ref id="B2"><label>2.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>Y</given-names></name> <name><surname>Keeling</surname> <given-names>K</given-names></name> <name><surname>Urbaczewski</surname> <given-names>A</given-names></name></person-group>. <article-title>The economic value of online user reviews with ad spending on movie box-office sales.</article-title> <source><italic>Inf Syst Front.</italic></source> (<year>2019</year>) <volume>21</volume>:<fpage>829</fpage>&#x2013;<lpage>44</lpage>.</citation></ref>
<ref id="B3"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lu</surname> <given-names>W</given-names></name> <name><surname>Xing</surname> <given-names>R</given-names></name></person-group>. <article-title>Research on movie box office prediction model with conjoint analysis.</article-title> <source><italic>Int J Inf Syst Supp Chain Manag.</italic></source> (<year>2019</year>) <volume>12</volume>:<fpage>72</fpage>&#x2013;<lpage>84</lpage>.</citation></ref>
<ref id="B4"><label>4.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>Y</given-names></name> <name><surname>Shiau</surname> <given-names>W</given-names></name> <name><surname>Shih</surname> <given-names>S</given-names></name> <name><surname>Chen</surname> <given-names>C</given-names></name></person-group>. <article-title>Considering online consumer reviews to predict movie box-office performance between the years 2009 and 2014 in the US.</article-title> <source><italic>Electron Lib.</italic></source> (<year>2018</year>) <volume>36</volume>:<fpage>1010</fpage>&#x2013;<lpage>26</lpage>.</citation></ref>
<ref id="B5"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oh</surname> <given-names>C</given-names></name> <name><surname>Roumani</surname> <given-names>Y</given-names></name> <name><surname>Nwankpa</surname> <given-names>J</given-names></name> <name><surname>Hue</surname> <given-names>H</given-names></name></person-group>. <article-title>Beyond likes and tweets: Consumer engagement behavior and movie box office in social media.</article-title> <source><italic>Inf Manag.</italic></source> (<year>2017</year>) <volume>54</volume>:<fpage>25</fpage>&#x2013;<lpage>37</lpage>.</citation></ref>
<ref id="B6"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>C</given-names></name> <name><surname>Leng</surname> <given-names>M</given-names></name> <name><surname>Liu</surname> <given-names>Z</given-names></name> <name><surname>Cui</surname> <given-names>X</given-names></name> <name><surname>Yu</surname> <given-names>J</given-names></name></person-group>. <article-title>The impact of recommender systems and pricing strategies on brand competition and consumer search.</article-title> <source><italic>Electron Commer Res Appl.</italic></source> (<year>2022</year>) <volume>53</volume>:<fpage>1</fpage>&#x2013;<lpage>15</lpage>.</citation></ref>
<ref id="B7"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Y</given-names></name></person-group>. <article-title>Word of mouth for movies: its dynamics and impact on box office revenue.</article-title> <source><italic>J Market.</italic></source> (<year>2006</year>) <volume>70</volume>:<fpage>74</fpage>&#x2013;<lpage>89</lpage>.</citation></ref>
<ref id="B8"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Song</surname> <given-names>T</given-names></name> <name><surname>Huang</surname> <given-names>J</given-names></name> <name><surname>Tan</surname> <given-names>T</given-names></name> <name><surname>Yu</surname> <given-names>Y</given-names></name></person-group>. <article-title>Using user- and marketer-generated content for box office revenue prediction: Differences between microblogging and third-party platforms.</article-title> <source><italic>Inf Syst Res.</italic></source> (<year>2019</year>) <volume>30</volume>:<fpage>191</fpage>&#x2013;<lpage>203</lpage>.</citation></ref>
<ref id="B9"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mbunge</surname> <given-names>E</given-names></name> <name><surname>Fashoto</surname> <given-names>S</given-names></name> <name><surname>Bimha</surname> <given-names>H</given-names></name></person-group>. <article-title>Prediction of box-office success: a review of trends and machine learning computational models.</article-title> <source><italic>Int J Bus Intell Data Min.</italic></source> (<year>2022</year>) <volume>20</volume>:<fpage>192</fpage>&#x2013;<lpage>207</lpage>.</citation></ref>
<ref id="B10"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>C</given-names></name> <name><surname>Ma</surname> <given-names>N</given-names></name> <name><surname>Cui</surname> <given-names>X</given-names></name> <name><surname>Liu</surname> <given-names>Z</given-names></name></person-group>. <article-title>The impact of online referral on brand market strategies with consumer search and spillover effect.</article-title> <source><italic>Soft Comput.</italic></source> (<year>2020</year>) <volume>24</volume>:<fpage>2551</fpage>&#x2013;<lpage>65</lpage>.</citation></ref>
<ref id="B11"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ahmed</surname> <given-names>U</given-names></name> <name><surname>Waqas</surname> <given-names>H</given-names></name> <name><surname>Afzal</surname> <given-names>MT</given-names></name></person-group>. <article-title>Pre-production box-office success quotient forecasting.</article-title> <source><italic>Soft Comput.</italic></source> (<year>2020</year>) <volume>24</volume>:<fpage>6635</fpage>&#x2013;<lpage>53</lpage>.</citation></ref>
<ref id="B12"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kang</surname> <given-names>L</given-names></name> <name><surname>Peng</surname> <given-names>F</given-names></name> <name><surname>Anwar</surname> <given-names>S</given-names></name></person-group>. <article-title>All that glitters is not gold: do movie quality and contents influence box-office revenues in China?</article-title> <source><italic>J Policy Model.</italic></source> (<year>2022</year>) <volume>44</volume>:<fpage>492</fpage>&#x2013;<lpage>510</lpage>.</citation></ref>
<ref id="B13"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>C</given-names></name> <name><surname>Tang</surname> <given-names>W</given-names></name> <name><surname>Zhao</surname> <given-names>R</given-names></name></person-group>. <article-title>Optimal consumption with reference-dependent preferences in on-the-job search and savings.</article-title> <source><italic>J Ind Manag Opt.</italic></source> (<year>2017</year>) <volume>13</volume>(<issue>1</issue>):<fpage>503</fpage>&#x2013;<lpage>27</lpage>.</citation></ref>
<ref id="B14"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wibowo</surname> <given-names>BS</given-names></name> <name><surname>Rubiana</surname> <given-names>F</given-names></name> <name><surname>Hartono</surname> <given-names>B</given-names></name></person-group>. <article-title>A data-driven investigation of successful local film profiles in the Indonesian box office.</article-title> <source><italic>J Manajemen Indones.</italic></source> (<year>2022</year>) <volume>22</volume>:<fpage>333</fpage>&#x2013;<lpage>44</lpage>.</citation></ref>
<ref id="B15"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghazali</surname> <given-names>L</given-names></name> <name><surname>Islam</surname> <given-names>R</given-names></name></person-group>. <article-title>Critical determinants of box office success for the Malaysian film industry.</article-title> <source><italic>Int J Bus Syst Res.</italic></source> (<year>2021</year>) <volume>15</volume>:<fpage>491</fpage>&#x2013;<lpage>509</lpage>.</citation></ref>
<ref id="B16"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Z</given-names></name></person-group>. <article-title>Impact of cost uncertainty on supply chain competition under different confidence levels.</article-title> <source><italic>Int Trans Operat Res.</italic></source> (<year>2021</year>) <volume>28</volume>:<fpage>1465</fpage>&#x2013;<lpage>504</lpage>.</citation></ref>
<ref id="B17"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>P</given-names></name> <name><surname>Zhao</surname> <given-names>R</given-names></name> <name><surname>Yan</surname> <given-names>Y</given-names></name> <name><surname>Zhou</surname> <given-names>C</given-names></name></person-group>. <article-title>Promoting end-of-season product through online channel in an uncertain market.</article-title> <source><italic>Eur J Operat Res.</italic></source> (<year>2021</year>) <volume>295</volume>:<fpage>935</fpage>&#x2013;<lpage>48</lpage>.</citation></ref>
<ref id="B18"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>P</given-names></name> <name><surname>Dong</surname> <given-names>Q</given-names></name></person-group>. <article-title>Box office prediction model based on web search data and machine learning.</article-title> <source><italic>Operat Res Manag Sci.</italic></source> (<year>2021</year>) <volume>30</volume>:<issue>168</issue>.</citation></ref>
<ref id="B19"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oomiya</surname> <given-names>N</given-names></name> <name><surname>Nakamura</surname> <given-names>Y</given-names></name></person-group>. <article-title>Proposal of an estimate of box office revenues using a movie scripts&#x2013;case of romance movies in Japan.</article-title> <source><italic>J Jap Soc Fuzzy Theor Intell Inf.</italic></source> (<year>2020</year>) <volume>32</volume>:<fpage>935</fpage>&#x2013;<lpage>43</lpage>.</citation></ref>
<ref id="B20"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boisvert</surname> <given-names>S</given-names></name></person-group>. <article-title>Les Hommes du box-office qu&#x00E9;b&#x00E9;cois: la construction s&#x00E9;rielle du genre dans les sequels Nitro Rush et Les 3 p&#x2019;tits cochons 2.</article-title> <source><italic>Nouvelles &#x00C9;tudes Francophones.</italic></source> (<year>2020</year>) <volume>35</volume>:<fpage>101</fpage>&#x2013;<lpage>16</lpage>.</citation></ref>
<ref id="B21"><label>21.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barnett</surname> <given-names>VL</given-names></name></person-group>. <article-title>Super Fly (1972), Coffy (1973) and The Mack (1973): under- and over-estimating blaxploitation box office.</article-title> <source><italic>Hist J Film Radio Telev.</italic></source> (<year>2020</year>) <volume>40</volume>:<fpage>373</fpage>&#x2013;<lpage>88</lpage>.</citation></ref>
<ref id="B22"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koo</surname> <given-names>HY</given-names></name> <name><surname>Lee</surname> <given-names>HJ</given-names></name> <name><surname>Lee</surname> <given-names>G</given-names></name></person-group>. <article-title>Influence of movie success factors including holdbacks in box office and VOD market.</article-title> <source><italic>Korean Manag Sci Rev.</italic></source> (<year>2021</year>) <volume>38</volume>:<fpage>47</fpage>&#x2013;<lpage>61</lpage>.</citation></ref>
<ref id="B23"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chu</surname> <given-names>M</given-names></name></person-group>. <article-title>The impact of online referral services on cooperation modes between brander and platform.</article-title> <source><italic>J Ind Manag Optim.</italic></source> (<year>2022</year>). <pub-id pub-id-type="doi">10.3934/jimo.2022174</pub-id></citation></ref>
<ref id="B24"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>J</given-names></name> <name><surname>Zhao</surname> <given-names>J</given-names></name> <name><surname>Zhou</surname> <given-names>C</given-names></name> <name><surname>Ren</surname> <given-names>Y</given-names></name></person-group>. <article-title>Strategic business mode choices for e-commerce platforms under brand competition.</article-title> <source><italic>J Theor Appl Electron Comm Res.</italic></source> (<year>2022</year>) <volume>17</volume>:<fpage>1769</fpage>&#x2013;<lpage>90</lpage>.</citation></ref>
<ref id="B25"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>J</given-names></name> <name><surname>Song</surname> <given-names>Z</given-names></name></person-group>. <article-title>Self-supporting or third-party? The optimal delivery strategy selection decision for e-tailers under competition.</article-title> <source><italic>Kybernetes.</italic></source> (<year>2022</year>). <pub-id pub-id-type="doi">10.1108/K-02-2022-0216</pub-id></citation></ref>
<ref id="B26"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X</given-names></name> <name><surname>Pan</surname> <given-names>HR</given-names></name> <name><surname>Zhu</surname> <given-names>N</given-names></name> <name><surname>Cai</surname> <given-names>S</given-names></name></person-group>. <article-title>East Asian films in the European market: the roles of cultural distance and cultural specificity.</article-title> <source><italic>Int Market Rev.</italic></source> (<year>2021</year>) <volume>38</volume>:<fpage>717</fpage>&#x2013;<lpage>35</lpage>.</citation></ref>
<ref id="B27"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>C</given-names></name> <name><surname>Tang</surname> <given-names>W</given-names></name> <name><surname>Zhao</surname> <given-names>R</given-names></name></person-group>. <article-title>Optimal consumer search with prospect utility in hybrid uncertain environment.</article-title> <source><italic>J Uncertainty Anal Appl.</italic></source> (<year>2015</year>) <volume>3</volume>:<fpage>1</fpage>&#x2013;<lpage>20</lpage>.</citation></ref>
<ref id="B28"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maulud</surname> <given-names>D</given-names></name> <name><surname>Abdulazeez</surname> <given-names>AM</given-names></name></person-group>. <article-title>A review on linear regression comprehensive in machine learning.</article-title> <source><italic>J Appl Sci Technol Trends.</italic></source> (<year>2020</year>) <volume>1</volume>:<fpage>140</fpage>&#x2013;<lpage>7</lpage>.</citation></ref>
<ref id="B29"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Filzmoser</surname> <given-names>P</given-names></name> <name><surname>Nordhausen</surname> <given-names>K</given-names></name></person-group>. <article-title>Robust linear regression for high-dimensional data: an overview.</article-title> <source><italic>Wiley Interdisc Rev Comput Stat.</italic></source> (<year>2021</year>) <volume>13</volume>:<issue>e1524</issue>.</citation></ref>
<ref id="B30"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yuan</surname> <given-names>C</given-names></name> <name><surname>Yang</surname> <given-names>H</given-names></name></person-group>. <article-title>Research on K-value selection method of K-means clustering algorithm.</article-title> <source><italic>Multidisc Sci J.</italic></source> (<year>2019</year>) <volume>2</volume>:<fpage>226</fpage>&#x2013;<lpage>35</lpage>.</citation></ref>
</ref-list>
</back>
</article>
