Peer-Review Record

NRPerson: A Non-Registered Multi-Modal Benchmark for Tiny Person Detection and Localization

Electronics 2024, 13(9), 1697; https://doi.org/10.3390/electronics13091697

by Yi Yang^†, Xumeng Han^†, Kuiran Wang, Xuehui Yu, Wenwen Yu, Zipeng Wang, Guorong Li, Zhenjun Han^* and Jianbin Jiao

Reviewer 1: Roberta Vrskova

Reviewer 2: Muhammad Munsif

Reviewer 3: Anonymous

Submission received: 30 March 2024 / Revised: 17 April 2024 / Accepted: 22 April 2024 / Published: 27 April 2024

(This article belongs to the Special Issue Big Model Techniques for Image Processing)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors of this paper create a non-registered multi-modal benchmark for tiny person detection and localization called NRPerson. The article is very well prepared; I have only the following reservations:
- I would recommend rearranging the layout of the chapters; for example, some subsections are only a few lines long and do not seem to warrant being separate subsections.
- The authors also created a benchmark; if they would like to share it with the scientific community, it would be good to link to it.
- The authors could also add information about how they plan to continue their research.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper establishes a non-registered multi-modal benchmark named NRPerson to push forward the frontiers of multi-modal tiny person detection and localization and to make it more applicable to real-world scenarios. While the topic is intriguing, the manuscript requires significant revisions. The following concerns require the authors' careful attention:

1. The language proficiency of the paper needs improvement and should be reviewed by a professional proofreader for clarity and coherence.

2. The title is confusing; please refrain from using abbreviated forms in the title to prevent potential confusion among readers.

3. The abstract offers a general overview but lacks the drawbacks of existing SLR systems.

4. The introduction is concise, and the motivation for the study is unclear, potentially confusing readers. A more detailed explanation of the study's purpose and objectives is recommended.

5. The literature presented in the introduction is outdated and too general. Include some recent (2022-2024) and relevant literature related to NRP and its specific drawbacks.

6. The contribution statement is weak; the mere mention of using a library is insufficient. A more robust and professionally written explanation of contributions is needed.

7. The literature section should include recent works (2022-2024) that are closely related to the topic.

8. Verify the results by comparing the proposed model with transformer-based models like SegFormer and UperNet.

9. Future work is missing in the conclusion.

10. Throughout the manuscript, especially in the introduction and method sections, there are instances of uncited content. Ensure proper citation, referencing works such as https://www.sciencedirect.com/science/article/pii/S0957417423009673 and https://doi.org/10.22967/HCIS.2024.14.004, to substantiate claims and give credit to existing research.

Comments on the Quality of English Language

The language proficiency of the paper needs improvement and should be reviewed by a professional proofreader for clarity and coherence.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This article establishes a non-registered multi-modal benchmark called NRPerson, which introduces the non-registered setting into multi-modal tiny person detection and localization. Using pseudo boxes generated from precise point annotations, a unified framework detects and localizes objects in non-registered multi-modal images. However, some issues still need improvement.

1. More video sequences could be used to cover a wider range of scenarios; such data is crucial for practicality and reliability.

2. The article lacks in-depth analysis and discussion on small person detection and localization techniques. Small person detection and localization involve multiple key technical points, such as feature extraction, object detection, multimodal fusion, etc. The article should delve into the application and challenges of these technical points on the NRPerson dataset, and propose targeted solutions.

3. Although there has been in-depth discussion on the selection criteria and optimization process of the model, it is necessary to increase the validation of model performance. In addition, there is insufficient discussion on the possibilities and limitations of the NRPerson dataset in practical applications, and there is a lack of case studies on practical application scenarios.

4. This article did not delve into the performance differences of the model on different subsets of the dataset, nor did it identify potential challenges and issues in the dataset.

