Topic Editors

School of Business, Deree—The American College of Greece, 6 Gravias Street, GR-153 42 Aghia Paraskevi, Greece
Faculty of Theoretical and Applied Economics, The Bucharest University of Economic Studies, Romana Square, No. 6, 010374 Bucharest, Romania

Big Data and Artificial Intelligence, 2nd Volume

Abstract submission deadline
closed (31 January 2024)
Manuscript submission deadline
closed (31 March 2024)
Viewed by
7276

Topic Information

Dear Colleagues,

The evolution of research in Big Data and artificial intelligence in recent years challenges almost all domains of human activity. The potential of artificial intelligence to act as a catalyst for all given business models, and the capacity of Big Data research to provide sophisticated data and services ecosystems at a global scale, provide a challenging context for scientific contributions and applied research. This Topic section promotes scientific dialogue on the added value of novel methodological approaches and research in the specified areas. Our interest is in the entire end-to-end spectrum of Big Data and artificial intelligence research, from social sciences to computer science, including strategic frameworks, models, and best practices, as well as sophisticated research related to radical innovation. The topics include, but are not limited to, the following indicative list:

  • Enabling Technologies for Big Data and AI research:
    • Data warehouses;
    • Business intelligence;
    • Machine learning;
    • Neural networks;
    • Natural language processing;
    • Image processing;
    • Bot technology;
    • AI agents;
    • Analytics and dashboards;
    • Distributed computing;
    • Edge computing.
  • Methodologies, frameworks, and models for artificial intelligence and Big Data research:
    • Towards sustainable development goals;
    • As responses to social problems and challenges;
    • For innovations in business, research, academia, industry, and technology;
    • For theoretical foundations and contributions to the body of knowledge of AI and Big Data research.
  • Best practices and use cases;
  • Outcomes of R&D projects;
  • Advanced data science analytics;
  • Industry-government collaboration;
  • Systems of information systems;
  • Interoperability issues;
  • Security and privacy issues;
  • Ethics on Big Data and AI;
  • Social impact of AI;
  • Open data.

Prof. Dr. Miltiadis D. Lytras
Prof. Dr. Andreea Claudia Serban
Topic Editors

 

Keywords

  • artificial intelligence
  • big data
  • machine learning
  • open data
  • decision making

Participating Journals

Journal Name                             Impact Factor  CiteScore  Launched Year  First Decision (median)  APC
Big Data and Cognitive Computing (BDCC)  3.7            4.9        2017           18.2 Days                CHF 1800
Economies                                2.6            3.2        2013           21.4 Days                CHF 1800
Information                              3.1            5.8        2010           18 Days                  CHF 1600
Remote Sensing                           5.0            7.9        2009           23 Days                  CHF 2700
Sustainability                           3.9            5.8        2009           18.8 Days                CHF 2400

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to enjoy the benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea from being stolen with this time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (5 papers)

20 pages, 6873 KiB  
Article
PD-LL-Transformer: An Hourly PM2.5 Forecasting Method over the Yangtze River Delta Urban Agglomeration, China
by Rongkun Zou, Heyun Huang, Xiaoman Lu, Fanmei Zeng, Chu Ren, Weiqing Wang, Liguo Zhou and Xiaoyan Dai
Remote Sens. 2024, 16(11), 1915; https://doi.org/10.3390/rs16111915 - 27 May 2024
Viewed by 220
Abstract
As the urgency of PM2.5 prediction becomes increasingly ingrained in public awareness, deep-learning methods have been widely used in forecasting concentration trends of PM2.5 and other atmospheric pollutants. Traditional time-series forecasting models, like long short-term memory (LSTM) and temporal convolutional network (TCN), were found to be efficient in atmospheric pollutant estimation, but either the model accuracy was not high enough or the models encountered certain challenges due to their own structure or some specific application scenarios. This study proposed a high-accuracy, hourly PM2.5 forecasting model, poly-dimensional local-LSTM Transformer, namely PD-LL-Transformer, by deep-learning methods, based on air pollutant data and meteorological data, and aerosol optical depth (AOD) data retrieved from the Himawari-8 satellite. This research was based on the Yangtze River Delta Urban Agglomeration (YRDUA), China for 2020–2022. The PD-LL-Transformer had three parts: a poly-dimensional embedding layer, which integrated the advantages of allocating and embedding multi-variate features in a more refined manner and combined the superiority of different temporal processing methods; a local-LSTM block, which combined the advantages of LSTM and TCN; and a Transformer encoder block. Over the test set (the whole year of 2022), the model’s R² was 0.8929, mean absolute error (MAE) was 4.4523 µg/m³, and root mean squared error (RMSE) was 7.2683 µg/m³, showing great accuracy for PM2.5 prediction. The model surpassed other existing models upon the same tasks and similar datasets, with the help of which a PM2.5 forecasting tool with better performance and applicability could be established.
(This article belongs to the Topic Big Data and Artificial Intelligence, 2nd Volume)
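The accuracy figures quoted in the abstract above (R², MAE, RMSE) can be computed for any pair of observed and predicted series. The following is a minimal Python sketch, not the authors' code. It assumes that R² denotes the coefficient of determination (one minus the ratio of the residual to the total sum of squares); the function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical hourly concentrations (µg/m³), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.4f}  MAE={mae:.4f}  RMSE={rmse:.4f}")
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason forecasting studies often report both.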

14 pages, 4246 KiB  
Article
Image-Based Leaf Disease Recognition Using Transfer Deep Learning with a Novel Versatile Optimization Module
by Petra Radočaj, Dorijan Radočaj and Goran Martinović
Big Data Cogn. Comput. 2024, 8(6), 52; https://doi.org/10.3390/bdcc8060052 - 23 May 2024
Viewed by 342
Abstract
Due to the projected increase in food production by 70% in 2050, crops should be additionally protected from diseases and pests to ensure a sufficient food supply. Transfer deep learning approaches provide a more efficient solution than traditional methods, which are labor-intensive and struggle to effectively monitor large areas, leading to delayed disease detection. This study proposed a versatile module based on the Inception module, Mish activation function, and Batch normalization (IncMB) as a part of deep neural networks. A convolutional neural network (CNN) with transfer learning was used as the base for evaluated approaches for tomato disease detection: (1) CNNs, (2) CNNs with a support vector machine (SVM), and (3) CNNs with the proposed IncMB module. In the experiment, the public dataset PlantVillage was used, containing images of six different tomato leaf diseases. The best results were achieved by the pre-trained InceptionV3 network, which contains an IncMB module with an accuracy of 97.78%. In three out of four cases, the highest accuracy was achieved by networks containing the proposed IncMB module in comparison to evaluated CNNs. The proposed IncMB module represented an improvement in the early detection of plant diseases, providing a basis for timely leaf disease detection.
(This article belongs to the Topic Big Data and Artificial Intelligence, 2nd Volume)

16 pages, 3440 KiB  
Article
Leveraging Remotely Sensed and Climatic Data for Improved Crop Yield Prediction in the Chi Basin, Thailand
by Akkarapon Chaiyana, Ratchawatch Hanchoowong, Neti Srihanu, Haris Prasanchum, Anongrit Kangrang, Rattana Hormwichian, Siwa Kaewplang, Werapong Koedsin and Alfredo Huete
Sustainability 2024, 16(6), 2260; https://doi.org/10.3390/su16062260 - 8 Mar 2024
Viewed by 767
Abstract
Predictions of crop production in the Chi basin are of major importance for decision support tools in countries such as Thailand, which aims to increase domestic income and global food security by implementing the appropriate policies. This research aims to establish a predictive model for predicting crop production for an internal crop growth season prior to harvest at the province scale for fourteen provinces in Thailand’s Chi basin between 2011 and 2019. We provide approaches for reducing redundant variables and multicollinearity in remotely sensed (RS) and meteorological data to avoid overfitting models using correlation analysis (CA) and the variance inflation factor (VIF). The temperature condition index (TCI), the normalized difference vegetation index (NDVI), land surface temperature (LSTnighttime), and mean temperature (Tmean) were the resulting variables in the prediction model with a p-value < 0.05 and a VIF < 5. The baseline data (2011–2017: June to November) were used to train four regression models, which revealed that eXtreme Gradient Boosting (XGBoost), random forest (RF), and XGBoost achieved R² values of 0.95, 0.94, and 0.93, respectively. In addition, the testing dataset (2018–2019) displayed a minimum root-mean-square error (RMSE) of 0.18 ton/ha for the optimal solution by integrating variables and applying the XGBoost model. Accordingly, it is estimated that between 2020 and 2022, the total crop production in the Chi basin region will be 7.88, 7.64, and 7.72 million tons, respectively. The results demonstrated that the proposed model is proficient at greatly improving crop yield prediction accuracy when compared to a conventional regression method and that it may be deployed in different regions to assist farmers and policymakers in making more informed decisions about agricultural practices and resource allocation.
(This article belongs to the Topic Big Data and Artificial Intelligence, 2nd Volume)

22 pages, 2151 KiB  
Article
Enhancing Supervised Model Performance in Credit Risk Classification Using Sampling Strategies and Feature Ranking
by Niwan Wattanakitrungroj, Pimchanok Wijitkajee, Saichon Jaiyen, Sunisa Sathapornvajana and Sasiporn Tongman
Big Data Cogn. Comput. 2024, 8(3), 28; https://doi.org/10.3390/bdcc8030028 - 6 Mar 2024
Viewed by 1603
Abstract
For the financial health of lenders and institutions, one important risk assessment called credit risk is about correctly deciding whether or not a borrower will fail to repay a loan. It not only helps in the approval or denial of loan applications but also aids in managing the non-performing loan (NPL) trend. In this study, a dataset provided by the LendingClub company based in San Francisco, CA, USA, from 2007 to 2020 consisting of 2,925,492 records and 141 attributes was experimented with. The loan status was categorized as “Good” or “Risk”. To yield highly effective results of credit risk prediction, experiments on credit risk prediction were performed using three widely adopted supervised machine learning techniques: logistic regression, random forest, and gradient boosting. In addition, to solve the imbalanced data problem, three sampling algorithms, including under-sampling, over-sampling, and combined sampling, were employed. The results show that the gradient boosting technique achieves nearly perfect Accuracy, Precision, Recall, and F1-score values, which are better than 99.92%, but its MCC values are greater than 99.77%. Three imbalanced data handling approaches can enhance the model performance of models trained by three algorithms. Moreover, the experiment of reducing the number of features based on mutual information calculation revealed slightly decreasing performance for 50 data features with Accuracy values greater than 99.86%. For 25 data features, which is the smallest size, the random forest supervised model yielded 99.15% Accuracy. Both sampling strategies and feature selection help to improve the supervised model for accurately predicting credit risk, which may be beneficial in the lending business.
(This article belongs to the Topic Big Data and Artificial Intelligence, 2nd Volume)
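The abstract above reports the Matthews correlation coefficient (MCC) alongside Accuracy, which matters for imbalanced labels such as "Good" versus "Risk". The following minimal Python sketch illustrates the difference; it is not the authors' code. The sample labels, the two classifiers, and the helper names are hypothetical, and returning 0.0 when the MCC denominator is zero is a common convention assumed here.

```python
import math


def confusion_counts(y_true, y_pred, positive):
    """Count (tp, fp, tn, fn), treating `positive` as the positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced sample: 95 "Good" loans followed by 5 "Risk" loans.
y_true = ["Good"] * 95 + ["Risk"] * 5

# Classifier A labels every loan "Good" and never flags a risky one.
pred_a = ["Good"] * 100
# Classifier B raises 2 false alarms and catches 4 of the 5 risky loans.
pred_b = ["Risk"] * 2 + ["Good"] * 93 + ["Risk"] * 4 + ["Good"] * 1

counts_a = confusion_counts(y_true, pred_a, positive="Risk")
counts_b = confusion_counts(y_true, pred_b, positive="Risk")
print("A: accuracy", accuracy(*counts_a), "MCC", mcc(*counts_a))
print("B: accuracy", accuracy(*counts_b), "MCC", mcc(*counts_b))
```

In this toy sample the two classifiers differ by only two points of Accuracy, while MCC separates them clearly, because MCC uses all four cells of the confusion matrix.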

23 pages, 4267 KiB  
Article
Nowcasting Unemployment Using Neural Networks and Multi-Dimensional Google Trends Data
by Andrius Grybauskas, Vaida Pilinkienė, Mantas Lukauskas, Alina Stundžienė and Jurgita Bruneckienė
Economies 2023, 11(5), 130; https://doi.org/10.3390/economies11050130 - 25 Apr 2023
Cited by 2 | Viewed by 2802
Abstract
This article forms an attempt to expand the ability of online search queries to predict initial jobless claims in the United States and further explore the intricacies of Google Trends. In contrast to researchers who used only a small number of search queries or limited themselves to job agency explorations, we incorporated keywords from the following six dimensions of Google Trends searches: job search, benefits, and application; mental health; violence and abuse; leisure search; consumption and lifestyle; and disasters. We also propose the use of keyword optimization, dimension reduction techniques, and long-short memory neural networks to predict future initial claims changes. The findings suggest that including Google Trends keywords from other dimensions than job search leads to the improved forecasting of errors; however, the relationship between jobless claims and specific Google keywords is unstable in relation to time.
(This article belongs to the Topic Big Data and Artificial Intelligence, 2nd Volume)