Leeds Beckett University - City Campus,
Woodhouse Lane,
LS1 3HE
Dr Akbar Sheikh Akbari
Reader
Dr Akbar Sheikh Akbari joined LBU in 2015. His research interest includes biometric identification techniques, hyperspectral image processing, colour balancing, image source camera identification, image super-resolution, image/video codecs, multiview image/video processing, assisted living technologies, deep learning, and artificial intelligence.
About
Dr Sheikh Akbari holds a BSc (Hons), an MSc (Distinction), and a PhD in Electronic and Electrical Engineering. After completing his PhD at Strathclyde University, he made significant contributions to an EPSRC project focused on stereo/multi-view video processing at Bristol University. His diverse career includes work in industry, particularly in the field of real-time embedded video analytics systems. Notably, he has successfully supervised ten PhD and two MRes students, and his research output comprises over 100 peer-reviewed international journal and conference papers.
Akbar's recent funded projects are:
- Leeds Beckett University and Fathers Farm Foods Ltd, KTP23_24R3, £294,274, (2023, 33 months). Akbar Sheikh-Akbari (lead academic), Theocharis Ispoglou (academic supervisor from the School of Sport), and John George (academic supervisor from the School of Health). Developing novel hyper-spectral imaging capability to screen for aflatoxins in pistachios
- Leeds Beckett University and Riverside Greetings mKTP 21_22 R4, £169,433.00, (2022, 24 months). Nick Halafihi (lead academic), Akbar Sheikh-Akbari, and Esther Pugh (academic supervisors). Project number: 10024497, Application of RFIDs for greeting cards' asset management
- KTP 10304, Innovate UK: £122,040.00, (July 2016-2018). Research, develop, and implement a scalable and modular system that monitors and analyses individual behavioral patterns and movements in a range of environments, Omega Security Systems/Leeds Beckett University. Akbar Sheikh-Akbari (academic supervisor). This Partnership was graded 'Outstanding' by Innovate UK
Academic positions
Reader (Associate Professor)
Leeds Beckett University, School of Built Environment, Engineering and Computing, Leeds, United Kingdom | 01 September 2019 - present
Senior Lecturer in Control Engineering & Robotics
Leeds Metropolitan University, Leeds, United Kingdom | February 2015 - 31 August 2019
Senior Research Fellow (Director of Research Centre in Applied Design for Business)
University of Gloucestershire, Cheltenham, United Kingdom | 01 October 2014 - 01 January 2015
Lecturer
Staffordshire University, Stoke-on-Trent, United Kingdom | 01 June 2011 - 01 September 2014
Senior Research Officer in Digital Processing
Staffordshire University, Stoke-on-Trent, United Kingdom | 01 February 2010 - 01 May 2011
Research Associate
University of Bristol, Bristol, United Kingdom | 01 May 2005 - 01 June 2008
Degrees
PhD
University of Strathclyde, United Kingdom | 05 February 2001 - 17 December 2004
MSc with distinction
Amirkabir University of Technology, Tehran, Iran | 23 September 1992 - 10 September 1995
BSc with first class honours
University of Sistan and Baluchestan, Zahedan, Iran | 23 September 1988 - 13 April 1992
Languages
English
Can read, write, speak, understand and peer review
Persian
Can read, write, speak, understand and peer review
Research interests
Dr Sheikh Akbari's research interest includes biometric identification techniques, hyperspectral image processing, colour balancing, image source camera identification, image super-resolution, image/video codecs, multiview image/video processing, assisted living technologies, deep learning, and artificial intelligence.
Publications (146)
The use of the wavelet transform to analyze the behaviour of complex systems from various fields has become widely recognized and successfully applied during the last few decades. In this book some advances in wavelet theory and their applications in engineering, physics and technology are presented. The applications were carefully selected and grouped in five main sections - Signal Processing, Electrical Systems, Fault Diagnosis and Monitoring, Image Processing and Applications in Engineering. One of the key features of this book is that the wavelet concepts have been described from a point of view that is familiar to researchers from various branches of science and engineering. The content of the book is accessible to a large number of readers.
Vein Detection in Hyperspectral Images using a Combination of Dimensionality Reduction Methods and 3D-CNN
Hyperspectral imaging (HSI) captures detailed spectral and spatial information, making it a versatile tool across various domains. This paper presents a Convolutional Neural Network (CNN)-based method for vein detection from hyperspectral images of human hands. The proposed method applies a dimensionality reduction technique to the input HSI to extract its features and reduce its dimensionality. Three dimensionality reduction methods, namely Principal Component Analysis (PCA), Folded-PCA (FPCA), and Inter-band Correlation and Clustering using the K-means clustering method (ICC_k-means), were used to reduce the dimensionality of the input image. A 3D-CNN model was then trained on the dimensionality-reduced HSI data to identify the location of vein pixels in the input HSI. Experimental results were generated using the HyperVein dataset, whose images were randomly divided into training, validation, and test sets. The results show that the proposed method achieves its best accuracy, precision, recall, false positive rate, false negative rate, and receiver operating characteristics when the ICC_k-means dimensionality reduction technique is used, compared to the PCA and FPCA methods.
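The inter-band correlation idea behind ICC_k-means can be illustrated with a much simpler greedy variant. The sketch below is an illustrative stand-in, not the paper's method: it merges runs of adjacent spectral bands whose Pearson correlation exceeds a threshold and averages each run, instead of clustering with k-means.

```python
def correlation(x, y):
    """Pearson correlation between two flattened bands."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def merge_correlated_bands(cube, threshold=0.95):
    """Greedy stand-in for inter-band correlation clustering: walk the
    spectral axis, group adjacent bands whose correlation with the previous
    band exceeds `threshold`, and replace each group by its average band."""
    groups = [[cube[0]]]
    for band in cube[1:]:
        if correlation(groups[-1][-1], band) >= threshold:
            groups[-1].append(band)
        else:
            groups.append([band])
    return [[sum(vals) / len(g) for vals in zip(*g)] for g in groups]
```

Highly correlated neighbouring bands collapse into one representative band, reducing the spectral dimensionality fed to the classifier.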
Impact of camera separation on performance of H.264/AVC-based stereoscopic video codec
The impact of camera separation on the performance of an H.264/AVC-based stereo-vision codec is investigated. To achieve this, the H.264/AVC software has been modified to support stereoscopic video coding. Results were generated using two sets of wide baseline convergent multi-view test videos: Breakdancers and Ballet. To generate a set of synchronised stereo-videos from the same scene with different inter-camera angles, all possible camera pairs are generated and classified according to their inter-camera angles. The resulting sets of stereo-videos are coded using H.264/AVC-based stereo-vision and simulcast codecs. Results indicate that the stereo-vision codec outperforms simulcast coding at lower inter-camera angles. Finally, a range of inter-camera angles for best use of either stereo-vision or simulcast coding is determined. © 2010 The Institution of Engineering and Technology.
Optimization of battery lifetime within wireless sensor networks (WSNs) is challenging due to the communication infrastructure. Consequently, minimizing the amount of power required for data collection and processing to serve the intended purposes has become an open research problem. Conventional and compressive sensing-based (CS) query processing, the candidate approaches for performing these tasks, require a comparative analysis in the current WSN application context. In this paper, simulations have been carried out to compare the performance of conventional and CS-based query processing with respect to energy efficiency, sensing reliability, and normalized estimation error within a WSN. A significant reduction in computational complexity, reaching 70%, is observed using CS compared to conventional query processing algorithms. Moreover, it is observed that up to 90% sensing reliability can be achieved with CS compared to existing query processing. Hence, the reduction in computational complexity has not compromised the sensing reliability, with an observed reduction in the normalized estimation error.
Fuzzy-based multiscale edge detection
A new fuzzy-based multiscale edge detection technique is presented. The proposed approach achieves optimal edge detection using the wavelet decomposition of the original signal followed by a novel fuzzy-based decision technique that is applied across the scales. Results indicate a significant improvement in locating edges compared to other multiscale approaches.
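The wavelet-then-combine-across-scales idea can be sketched with a plain Haar decomposition and a crisp cross-scale product standing in for the fuzzy decision rule (both are simplifications; `haar_step` and `multiscale_edge_strength` are illustrative names, not the paper's implementation).

```python
def haar_step(signal):
    """One level of the (unnormalised) Haar wavelet: pairwise
    averages (approximation) and half-differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def multiscale_edge_strength(signal, levels=2):
    """Edge strength as the product of detail magnitudes across scales:
    true edges persist across scales, so their product stays large,
    while isolated noise vanishes at coarser levels."""
    n = len(signal)
    strength = [1.0] * n
    approx = list(signal)
    for lev in range(levels):
        approx, detail = haar_step(approx)
        span = 2 ** (lev + 1)  # samples covered by one detail coefficient
        for i in range(n):
            strength[i] *= abs(detail[min(i // span, len(detail) - 1)])
    return strength
```

On a step signal the product peaks at the discontinuity and is exactly zero in the flat regions.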
Adaptive joint subband vector quantisation codec for handheld videophone applications
A new wavelet-based video codec is presented, which joins vectors in the same subbands of different frames in a frame group and uses the properties of the human visual system. Perceptual weights are designed and used in the vector selection and quantisation of the wavelet coefficients. Results indicate a significant improvement in frame quality compared to JPEG2000 image sequence coder.
This paper presents an HEVC-based multi-view video codec. The frames of the multi-view videos are interleaved to generate a monoscopic video sequence. The interleaving is conducted in a way that increases the exploitation of the temporal and inter-view correlations. The MV-HEVC standard codec is configured to work as a single-layered codec, which functions as a monoscopic HEVC codec with AVC capabilities, and is used to encode the interleaved multi-view video frames. The performance of the codec is compared with the anchor standard MV-HEVC codec by coding three standard multi-view video sequences: “Balloon”, “Kendo” and “Newspaper1”. Experimental results show the proposed codec outperforms the anchor standard MV-HEVC codec in terms of bitrate and PSNR.
An extension of the baseline Non-Intrusive Load Monitoring approach for energy disaggregation using temporal contextual information is presented in this paper. In detail the proposed approach uses a two-stage disaggregation methodology with appliance-specific temporal contextual information in order to capture time varying power consumption patterns in low frequency datasets. The proposed methodology was evaluated using datasets of different sampling frequency, number and type of appliances. When employing appliance-specific temporal contextual information an improvement of 1.5% up to 7.3% was observed. With the two-stage disaggregation architecture and using appliance-specific temporal contextual information the overall energy disaggregation accuracy was further improved across all evaluated datasets with the maximum observed improvement, in terms of absolute increase of accuracy, being equal to 6.8%, thus resulting in a maximum total energy disaggregation accuracy improvement equal to 10.0%.
A data-driven methodology to improve the energy disaggregation accuracy during Non-Intrusive Load Monitoring is proposed. In detail, the method is using a two-stage classification scheme, with the first stage consisting of classification models processing the aggregated signal in parallel and each of them producing a binary device detection score, and the second stage consisting of fusion regression models for estimating the power consumption for each of the electrical appliances. The accuracy of the proposed approach was tested on three datasets (ECO, REDD and iAWE), which are available online, using four different classifiers. The presented approach improves the estimation accuracy by up to 4.1% with respect to a basic energy disaggregation architecture, while the improvement on device level was up to 10.1%. Analysis on device level showed significant improvement of power consumption estimation accuracy especially for continuous and non-linear appliances across all evaluated datasets.
COVID-19 Detection from Chest X-Rays using Transfer Learning with Deep Convolutional Neural Networks
Several well-known pretrained deep CNN models were evaluated on their ability to detect COVID-19 from chest X-ray images, following a transfer learning approach. The retrained models were tested on two different datasets containing COVID-19, normal, viral and bacterial pneumonia cases. The best performing models among the evaluated ones were the MobileNet, DenseNet and ResNet after transfer learning retraining with top performing classification accuracies varying from 96.76% to 100%, thus indicating the potential of detecting the new coronavirus from X-ray images.
Color constancy is the capability to observe the true color of a scene from its image regardless of the scene’s illuminant. It is a significant part of the digital image processing pipeline and is utilized when the true color of an object is required. Most existing color constancy methods assume a uniform illuminant across the whole scene of the image, which is not always the case. Hence, their performances are influenced by the presence of multiple light sources. This paper presents a color constancy adjustment technique that uses the texture of the image pixels to select pixels with sufficient color variation to be used for image color correction. The proposed technique applies a histogram-based algorithm to determine the appropriate number of segments to efficiently split the image into its key color variation areas. The K-means++ algorithm is then used to divide the input image into the pre-determined number of segments. The proposed algorithm identifies pixels with sufficient color variation in each segment using the entropies of the pixels, which represent the segment’s texture. Then, the algorithm calculates the initial color constancy adjustment factors for each segment by applying an existing statistics-based color constancy algorithm on the selected pixels. Finally, the proposed method computes color adjustment factors per pixel within the image by fusing the initial color adjustment factors of all segments, which are regulated by the Euclidian distances of each pixel from the centers of gravity of the segments. Experimental results on benchmark single- and multiple-illuminant image datasets show that the images that are obtained using the proposed algorithm have significantly higher subjective and very competitive objective qualities compared to those that are obtained with the state-of-the-art techniques.
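The statistics-based estimator at the core of such pipelines can be as simple as the classic gray-world assumption, shown below applied globally rather than per segment (a deliberate simplification of the paper's segment-wise, entropy-weighted scheme; function names are illustrative).

```python
def gray_world_gains(image):
    """Per-channel gains from the gray-world assumption: the average
    scene colour is achromatic, so each RGB channel is scaled so that
    its mean matches the global mean. `image` is rows of [r, g, b]."""
    n = len(image) * len(image[0])
    means = [sum(px[c] for row in image for px in row) / n for c in range(3)]
    gray = sum(means) / 3
    return [gray / m for m in means]

def apply_gains(image, gains):
    """Apply per-channel gains (no clipping, for brevity)."""
    return [[[px[c] * gains[c] for c in range(3)] for px in row]
            for row in image]
```

A uniformly reddish cast is neutralised: after correction the three channel values of each pixel coincide.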
Performance improvement of the linear muskingum flood routing model using optimization algorithms and data assimilation approaches
The Muskingum model is one of the most widely used hydrological methods in flood routing, and calibrating its parameters is an ongoing research challenge. We optimized Muskingum model parameters to accurately simulate hourly output hydrographs of three flood-prone rivers in the Karun watershed, Iran. We evaluated model performance using the correlation coefficient (CC), the ratio of the root-mean-square error to the standard deviation of measured data (PSR), Nash–Sutcliffe efficiency (NSE), and index of agreement (d). The results show that the gray wolf optimization (GWO) algorithm, with CC = 0.99455, PSR = 0.155, NSE = 0.9757, and d = 0.9945, performed better in simulating the flood in the first study area. The Kalman filter (KF) improved these measures by +0.00516, −0.1246, +0.02328, and +0.00527, respectively. Our findings for the second flood show that the gravitational search algorithm (GSA), with CC = 0.9941, PSR = 0.1669, NSE = 0.9721, and d = 0.9921, performed better than all other algorithms. The Kalman filter enhanced each of the measures by +0.00178, −0.0175, +0.0055 and +0.0021, respectively. The gravitational search algorithm also performed best in the third flood, with CC = 0.9786, PSR = 0.2604, NSE = 0.9321, and d = 0.9848, and with improvements in accuracy using the Kalman filter of +0.01081, −0.0971, +0.394, and +0.0078, respectively. We recommend the use of GWO-KF and GSA-KF for flood routing studies with flood events of high volumes and hydrograph base times.
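The linear Muskingum routing step itself is short. A minimal sketch using the standard coefficient formulas, where K (storage time constant) and x (weighting factor) are the two parameters the optimization algorithms calibrate:

```python
def muskingum_route(inflow, K, x, dt):
    """Route an inflow hydrograph with the linear Muskingum model:
    O[t] = c0*I[t] + c1*I[t-1] + c2*O[t-1], with the classic
    coefficients derived from S = K*(x*I + (1-x)*O)."""
    denom = 2 * K * (1 - x) + dt
    c0 = (dt - 2 * K * x) / denom
    c1 = (dt + 2 * K * x) / denom
    c2 = (2 * K * (1 - x) - dt) / denom
    assert abs(c0 + c1 + c2 - 1.0) < 1e-9  # coefficients must sum to 1
    outflow = [inflow[0]]  # initial outflow assumed equal to initial inflow
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow
```

Routing a pulse hydrograph attenuates and delays the peak, which is exactly the behaviour the calibrated parameters must reproduce.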
Lung cancer is one of the leading causes of cancer-related mortality. Early detection and classification of cancerous tissue reduce the mortality rate. The present research focuses on the development of an automated classification model for lung and colon cancer tissue based on histopathology images. The present work encompasses a vision transformer (ViT) based model to enhance the diagnostic accuracy for lung cancer tissue. The proposed model utilizes the self-attention mechanism of the ViT to focus on essential features present in histopathological images. The proposed model has been validated using two different datasets, namely LC25000 and IQ-OTH/NCCD, with 25,000 and 1,096 images respectively. The performance of the proposed model is compared with a traditional convolutional neural network (CNN) model, and the ViT-based model is observed to perform better in terms of accuracy, achieving 98.80% and 99.09% on the two datasets respectively.
Determining Low-Temperature Fracture Resistance Curves of Normal and Rubberized Asphalt Concrete Using Convolutional Neural Networks in Single-Edge Notched Beam Tests
Ensemble Based Hybrid Transfer Learning Approach for an Effective 2d Ear Recognition System
The early identification and categorization of brain tumors through MRI scans are pivotal for effective medical intervention. The present article encompasses a novel integrated framework that combines traditional machine learning and deep learning methods to categorize images of brain tumors. Utilizing the VGG-19 model pre-trained on ImageNet, we extract high-level features from MRI images, which are further processed by a Long Short-Term Memory (LSTM) framework to extract spatial and temporal dependencies within the data. To manage the high-dimensional feature space effectively, we employ Principal Component Analysis (PCA) for dimensionality reduction, followed by a Support Vector Machine (SVM) for the final classification task. We utilized a variety of data augmentation approaches to enhance the capability of the architecture to generalize. Additionally, we fine-tuned the training parameters by employing the Adam optimizer along with early stopping and learning rate decay strategies. The model demonstrated exceptional precision, recall, and F1-score metrics, with an accuracy of 97.86%. This study not only validates the effectiveness of integrating CNNs, RNNs, and SVMs but also opens avenues for future research in medical image analysis using hybrid deep learning frameworks. Experimental outcomes demonstrate that the proposed model significantly improves the accuracy of brain tumor classification compared to previous methods, offering a promising tool for aiding radiologists in the rapid and accurate diagnosis of brain tumors.
Epilepsy is a neurological disorder in which normal brain activity is affected. Electroencephalography (EEG) is the gold standard for predicting epilepsy seizures. Manual inspection of EEG signals to detect the various seizure phases is a major challenge; a mechanism to automate seizure prediction and ease the work of clinicians is therefore required. Moreover, epilepsy patients face difficulties in social gatherings because seizures are unpredictable, which can cause anxiety and fear. To overcome these challenges, automatic seizure prediction is vital. The present work encompasses channel selection and data augmentation methods to create a system that automatically classifies two seizure phases, ictal and preictal, using a 1D Convolutional Neural Network (CNN) model. Instead of using all channels of the EEG signals, a subset of channels is used in this work to analyze its effect on performance. Data augmentation is used to enlarge the dataset, which is limited as a result of channel selection. The model achieves an accuracy of 99.62%, specificity of 99.76%, and sensitivity of 99.70% on the CHB-MIT dataset. This work shows that a subset of channels with greater importance can improve the robustness of a seizure prediction system, which in turn saves the time needed to set up the electrodes on the patient's scalp. Various data augmentation schemes can be used to compensate for the limited dataset, and models like the 1D CNN are suitable for designing low-power seizure prediction systems.
Today, in every field, there is a desire to obtain maximum profit with the least investment; efficiency and utilization with minimum investment are key requirements, and this is where the idea of optimization comes in. Optimization is the process of choosing the best available option from given alternatives, which yields the best solution. For example, in the design of a bridge, civil engineers must make many decisions at various stages of construction. Optimization is thus nothing but making the best feasible decision, with the goal of either maximizing profit or minimizing effort, and it is a crucial tool for analyzing systems. Maximization and minimization are the two categories of optimization problems. Optimization methods are applied to problems in many fields to handle practical challenges; the idea is not limited to a few areas but is used widely across disciplines. With the advancement of computing techniques, optimization has become an important part of problem-solving. Many optimization techniques have been proposed by researchers in recent years; however, no single method is suitable for all optimization problems, and the most appropriate method is selected based on the specific problem.
Batik is a traditional textile art form native to Southeast Asia, especially prominent in Malaysia and Indonesia, where unique patterns reflect significant cultural value. The intricate designs of batik, often embodying floral, geometric, and symbolic elements, make automated classification challenging and time intensive. This study presents a method for classifying Malaysian and Indonesian batik patterns using deep learning models. A curated dataset of 1,825 batik images was compiled, consisting of 949 Indonesian batik images and 876 Malaysian batik images. Three popular Convolutional Neural Network (CNN) architectures: MobileNet v2, YOLO-v8, and LeNet-5 were evaluated based on classification accuracy, loss, and training efficiency. Results show that YOLO-v8 achieved the highest accuracy at 98.80%, followed by MobileNet v2 with 97.79%, and LeNet-5 with 92.94%. These findings indicate that CNN models can effectively distinguish between Malaysian and Indonesian batik designs, offering valuable applications in cultural preservation and industry documentation. Future work could focus on refining these models for real-time use and expanding the dataset to capture additional regional variations in batik design.
Person identification using ear images has gained significant attention recently. Transfer learning provides an effective platform for image classification, utilizing CNNs like AlexNet, ResNet, VGG16, and VGG19, which are fine-tuned for specific applications. Combining transfer learning with support vector machines (SVM) enhances people recognition via ear images. This paper integrates a hybrid transfer learning model with an ensemble technique to improve recognition accuracy. We use pre-trained CNN models, VGG16 and VGG19, for feature extraction and replace the fully connected layer with an SVM classifier. Using the SoftMax activation function, each model generates a probabilistic output, which is averaged for classification. The proposed ensemble model was validated on two datasets with variations in pose, illumination, and rotation. Simulation results show that the ensemble-based transfer learning approach outperforms its two anchor models and competes with state-of-the-art ear recognition techniques.
Many natural phenomena suggest that biological algorithms are embedded in an organism's genome and expressed in cognition and behaviour through complex biological mechanisms. This review discusses these phenomena and proposes methods to explore them, focusing on algorithms embedded in neural systems. The application scope of biological algorithms is not limited to biology and medicine but extends to various engineering fields. The mathematical problems behind biological algorithms also prompt questions about the explainable aspects of artificial intelligence models. We discovered that computational tools can indeed be utilized to recover these algorithms, leading us to conduct some preliminary experiments using existing computational methods. Despite this progress, the current tools have limitations. To overcome these challenges, it will be necessary to design targeted experiments aimed at observing the dynamics of the neuronal gene expression system. In light of this, we have highlighted the theoretical aspects and suggest potential research directions that we hope will advance this field in the future.
The utilization of land is a major challenge in the current era. Land cover refers to the surface cover on the ground: vegetation, grass, water bodies, bare land, or any other cover. Land use refers to the purpose the land serves, for example residential, wildlife habitat, or agriculture. Databases of land cover and land use become outdated quickly; hence, an automatic update process is required. The present approach determines land cover and classifies land use objects using convolutional neural networks (CNNs), and studies the effect of changing parameters on the results. The input data for the proposed approach are aerial images from Sentinel-2 satellite imagery. Land cover and land use for each image have been determined using the CNN. The present work also describes the effect of changing parameters on the results and the output generated in each case. Comparisons of the results with different existing algorithms have also been analyzed. Experiments show that the overall accuracy of the proposed approach is 93-95% for land cover and land use. The classification of land cover and land use makes a positive contribution toward the utilization of land by humans.
Smart meters are used to measure the energy consumption of households; beyond metering, they support load forecasting, the reduction of consumer bills, and the reduction of grid distortions. Smart meters can also be used to disaggregate energy consumption at the device level. In this paper, we investigated the potential of identifying the multimedia content played by a TV or monitor using the house's central smart meter, which measures the aggregated energy consumption of all working appliances in the household. The proposed architecture is based on the elastic matching of aggregated energy signal frames against 20 reference TV channel signals. Different elastic matching algorithms using symmetric distance measures were evaluated, with the best video content identification accuracy of 93.6% achieved by the MVM algorithm.
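Elastic matching of energy frames against reference signals can be illustrated with plain dynamic time warping; the paper's best result uses the MVM algorithm, so the DTW sketch below shows the general frame-to-reference alignment idea rather than the reported configuration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences: an elastic
    match that allows local stretching/compression of the time axis.
    Classic O(len(a)*len(b)) dynamic program."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Because the alignment is elastic, a time-stretched copy of a reference signal still matches it with zero distance, which is what makes this family of measures suitable for comparing energy frames of unequal pacing.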
An Approach for Denoising of Contaminated Signal Using Fractional Order Differentiator
Calculus of integer order is more than a part of our daily life; as the order deviates into the fractional realm, things become much more interesting. The current work proposes a novel method for denoising contaminated signals using a fractional-order derivative-based differentiator. The Riemann–Liouville definition has been used for fractionalizing the differentiator. The defined methodology produces a fractional-order differentiator that can treat signals of different natures, and its findings are established experimentally. The proposed method has also been compared with different techniques available in the literature. The results obtained through the experiments are promising.
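For intuition, the Riemann–Liouville derivative of a sampled signal is commonly approximated by the Grünwald–Letnikov scheme, whose binomial weights follow a simple recursion. A minimal sketch (an illustrative discretization, not the paper's exact differentiator):

```python
def gl_fractional_diff(signal, alpha, h=1.0, memory=None):
    """Grünwald–Letnikov approximation of an order-`alpha` derivative,
    which coincides with the Riemann–Liouville derivative for
    well-behaved signals. Weights follow the recursion
    w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k)."""
    n = len(signal)
    m = n if memory is None else min(memory, n)
    w = [1.0]
    for k in range(1, m):
        w.append(w[-1] * (1 - (alpha + 1) / k))
    out = []
    for t in range(n):
        # weighted sum over the signal's past (short-memory truncation at m)
        acc = sum(w[k] * signal[t - k] for k in range(min(t + 1, m)))
        out.append(acc / h ** alpha)
    return out
```

Sanity checks: alpha = 1 reduces to the ordinary backward difference, and alpha = 0 returns the signal itself; intermediate orders interpolate between the two, which is what gives the fractional differentiator its tunable noise-suppression behaviour.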
Support Vector Machine (SVM) is a supervised machine learning algorithm used for robust and accurate classification. Despite its advantages, its classification speed deteriorates due to the large number of support vectors when dealing with large-scale problems, and its performance depends on its kernel parameter. This paper presents a kernel parameter optimization algorithm for SVM based on the Sliding Mode Control algorithm, operating in a closed-loop manner. The proposed method defines an error equation and a sliding surface and iteratively updates the Radial Basis Function (RBF) kernel parameter or the 2-degree polynomial kernel parameters, forcing the SVM training error to converge below a threshold value. Due to the closed-loop nature of the proposed algorithm, key features such as robustness to uncertainty and fast convergence are obtained. To assess the performance of the proposed technique, ten standard benchmark databases covering a range of applications were used. The proposed method and the state-of-the-art techniques were then used to classify the data. Experimental results show the proposed method is significantly faster and more accurate than the anchor SVM technique and some of the most recent methods. These achievements are due to the closed-loop nature of the proposed algorithm, which has significantly reduced the data dependency of the proposed method.
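The closed-loop idea, driving a kernel parameter with sign-based fixed steps until the training error crosses a threshold, can be sketched with a toy error surface in place of real SVM training (the function name, the error surface, and the step rule are all assumptions for illustration, not the paper's sliding-surface design).

```python
def tune_kernel_param(train_error, gamma0, step=0.1, tol=0.15, max_iter=200):
    """Closed-loop parameter tuning sketch: probe the training error on
    both sides of gamma, take a fixed sign-driven step toward the lower
    error (sliding-mode style), and stop once the error falls below
    `tol`. `train_error` stands in for the SVM training-error measurement."""
    gamma = gamma0
    for _ in range(max_iter):
        if train_error(gamma) < tol:
            break
        # sign of the finite-difference slope decides the step direction
        slope = train_error(gamma + step) - train_error(gamma - step)
        gamma -= step * (1 if slope > 0 else -1)
    return gamma
```

The fixed step size and sign-only feedback are what make this family of controllers robust to the exact shape of the error surface, at the cost of chattering near the optimum.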
This paper presents a Chainlet based Ear Recognition algorithm using Multi-Banding and Support Vector Machine (CERMB-SVM). The proposed method divides the gray input image into a number of bands based on the intensity of its pixels, resembling a hyperspectral image. It then applies Canny edge detection on each resulting normalized band, extracting edges that represent the ear pattern in each band. The resulting binary edge maps are then flattened, generating a single binary edge map. This edge map is then split into non-overlapping cells and the Freeman chain code for each group of connected edges within each cell is calculated. A histogram of each group of contiguous four cells is calculated, and the resulting histograms are then normalized and concatenated to form a chainlet for the input image. The resulting chainlet histogram vectors of the images of the dataset are then used for training and testing a pairwise Support Vector Machine (SVM). Experimental results on images of two benchmark ear image datasets show that the proposed CERMB-SVM technique significantly outperforms Principal Component Analysis based methods. In addition, CERMB-SVM also outperforms its anchor chainlet method and state-of-the-art learning-based ear recognition techniques.
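The Freeman chain code at the heart of chainlets encodes each move between neighbouring edge pixels as one of eight directions. A minimal sketch:

```python
# 8-connected Freeman directions for a (row, col) grid with row 0 at the top:
# 0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE
DIRECTIONS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
              (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def freeman_chain_code(points):
    """Chain code of an ordered list of 8-connected (row, col) edge pixels:
    one direction symbol per step along the contour."""
    return [DIRECTIONS[(r2 - r1, c2 - c1)]
            for (r1, c1), (r2, c2) in zip(points, points[1:])]
```

Histograms of these direction symbols over small cells, normalized and concatenated, are what form the chainlet descriptor.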
Principal Component Analysis (PCA) has been successfully applied to many applications, including ear recognition. This paper presents a Two Dimensional Multi-Band PCA (2D-MBPCA) method, inspired by PCA based techniques for multispectral and hyperspectral images, which have demonstrated significantly higher performance than standard PCA. The proposed method divides the input image into a number of images based on the intensity of the pixels. Three different methods are used to calculate the pixel intensity boundaries, called: equal size, histogram, and greedy hill climbing based techniques. Conventional PCA is then applied on the resulting images to extract their eigenvectors, which are used as features. The optimal number of bands was determined using the intersection of the number of features and the total eigenvector energy. Experimental results on two benchmark ear image datasets demonstrate that the proposed 2D-MBPCA technique significantly outperforms single image PCA by up to 56.41% and the eigenfaces technique by up to 29.62% with respect to matching accuracy. Furthermore, it gives very competitive results to those of learning based techniques at a fraction of their computational cost and without a need for training.
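The band-splitting step with equal-size intensity boundaries (the first of the three boundary-selection methods) can be sketched as follows; applying conventional PCA to each resulting band image is omitted for brevity, and the function name is illustrative.

```python
def split_into_bands(image, n_bands, max_val=255):
    """Split a grayscale image (list of rows of pixel values) into
    `n_bands` band images using equal-size intensity boundaries:
    band b keeps pixels whose value lies in [b*step, (b+1)*step) and
    zeroes everything else, mimicking the bands of a multispectral image."""
    step = (max_val + 1) / n_bands
    bands = []
    for b in range(n_bands):
        lo, hi = b * step, (b + 1) * step
        bands.append([[p if lo <= p < hi else 0 for p in row] for row in image])
    return bands
```

Each band isolates one intensity range, so the subsequent per-band PCA sees structures of similar brightness together rather than mixed across the whole dynamic range.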
Approximate computing is a promising approach for reducing power consumption and design complexity in applications where accuracy is not a crucial factor. Approximate multipliers are commonly used in error-tolerant applications. This paper presents three approximate 4:2 compressors and two approximate multiplier designs, aiming to reduce area and power consumption while maintaining acceptable accuracy. The paper seeks to develop approximate compressors that align positive and negative approximations for input patterns that have the same probability. Additionally, the proposed compressors are utilized to construct approximate multipliers for distinct columns of partial products based on the input probabilities of the two compressors in adjacent columns. The proposed approximate multipliers are synthesized in 28 nm technology. Compared to the exact multiplier, the first proposed multiplier improves power×delay and area×power by 91% and 86%, respectively, while the second proposed multiplier improves the two parameters by 90% and 84%, respectively. The performance of the proposed approximate methods was assessed and compared with existing methods for image multiplication, sharpening, smoothing, and edge detection. The performance of the proposed multipliers in hardware implementations of neural networks was also investigated, and the simulation results indicate that the proposed multipliers have appropriate accuracy in these applications.
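A generic illustration of the idea, assuming a textbook exact 4:2 compressor and one illustrative approximation (not the paper's proposed designs), is:

```python
from itertools import product

def exact_compressor(x1, x2, x3, x4, cin):
    """Exact 4:2 compressor: x1+x2+x3+x4+cin == sum + 2*(carry + cout)."""
    s1 = x1 ^ x2 ^ x3
    cout = (x1 & x2) | (x2 & x3) | (x1 & x3)   # majority of x1..x3
    total_sum = s1 ^ x4 ^ cin
    carry = (s1 & x4) | (x4 & cin) | (s1 & cin)
    return total_sum, carry, cout

def approx_compressor(x1, x2, x3, x4):
    """Illustrative approximate compressor (cin/cout dropped): much cheaper
    logic, at the cost of being wrong on 5 of the 16 input patterns."""
    return (x1 ^ x2) | (x3 ^ x4), (x1 & x2) | (x3 & x4)

# Exhaustively compare the approximation against the true bit sum.
errors = 0
for x1, x2, x3, x4 in product((0, 1), repeat=4):
    s, c = approx_compressor(x1, x2, x3, x4)
    errors += (s + 2 * c) != (x1 + x2 + x3 + x4)
```

The design trade-off the paper explores is exactly this kind of accounting: which input patterns may err, with what sign, and how probable those patterns are in a column of partial products.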
Convolutional Neural Networks (CNNs) have emerged as a popular choice of researchers for their robust feature extraction and information-mining capability. In recent decades, CNNs have shown impressive performance on various computer vision tasks such as object detection, image segmentation, and image classification. However, ear-based recognition systems have not gained many benefits from deep learning and CNN-based applications and still lag behind, owing to the lack of sufficient data and the varying conditions of captured sample images. In this paper, transfer learning techniques are applied to the well-known convolutional neural network model VGG16, integrated with a Support Vector Machine (SVM), acting as a hybrid algorithm for recognizing a person from their ear images. The proposed model is validated on an ear dataset containing a total of 2600 images with variability in terms of pose, rotation, and illumination changes. The proposed model is able to classify ear images with a highest recognition accuracy of 98.72%. To show the effectiveness of the proposed model, comparative studies with other existing methods reported in the literature are presented.
Development of Image Processing Techniques in Crack Detection and Analysis
Inspection of aircraft skin is required under the Corrosion Prevention and Control Program (CPCP) to ensure aircraft structural integrity. Human visual inspection is the most widely used technique in aircraft surface inspection, according to the CPCP. Scheduled inspections and regular maintenance of an aircraft through conventional methods constitute tedious and lengthy procedures. Visual inspections often lead to subjective judgement and offer little repeatability. Many automated vision-based aircraft skin inspection systems have been designed over the past years to provide a safe, quick, and accurate visual assessment. This paper presents a section of research investigating the detection and accurate location of defects on the outer body of an aircraft, using an Unmanned Aerial Vehicle (UAV) to capture images and digital image processing techniques to locate possible cracks. The inspection system is used to initially detect the locations of cracks (defects) on an aircraft's outer skin; a detected crack is then further investigated using thermal and ultrasound imaging methods. The scope of this paper includes a review of the design and development of a series of advanced dedicated image processing algorithms suitable for applying digital image processing to images captured from the outer surface of a typical aircraft fuselage.
The application of the Support Vector Machine (SVM) classification algorithm to large-scale datasets is limited due to its use of a large number of support vectors and the dependency of its performance on its kernel parameter. In this paper, SVM is redefined as a control system, and an Iterative Learning Control (ILC) method is used to optimize SVM's kernel parameter. The ILC technique first defines an error equation and then iteratively updates the kernel function and its regularization parameter using the training error and the previous state of the system. The closed-loop structure of the proposed algorithm increases the robustness of the technique to uncertainty and improves its convergence speed. Experimental results were generated using nine standard benchmark datasets covering a wide range of applications. Experimental results show that the proposed method generates superior or very competitive results in terms of accuracy compared with classical and state-of-the-art SVM-based techniques while using a significantly smaller number of support vectors.
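The iterative update loop can be sketched schematically. Here the SVM training step is replaced by a stand-in error signal, and the gain and stopping rule are assumptions, so this shows only the closed-loop shape of the idea:

```python
def ilc_tune(error_of, gamma0, learning_rate=0.5, tol=1e-3, max_iter=100):
    """Iterative-learning-style parameter update:
    gamma_{k+1} = gamma_k - L * e_k, where e_k is the (signed)
    training-error signal observed at iteration k."""
    gamma = gamma0
    for _ in range(max_iter):
        e = error_of(gamma)
        if abs(e) < tol:        # error driven below the threshold: stop
            break
        gamma = gamma - learning_rate * e
    return gamma

# Stand-in error signal: zero at an (unknown to the tuner) optimum gamma* = 2.0.
# In the real method this would come from retraining the SVM at each gamma.
error = lambda g: g - 2.0
tuned = ilc_tune(error, gamma0=10.0)
```

The closed loop makes convergence depend on the observed error rather than on a fixed search grid, which is where the claimed robustness comes from.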
Identification of TV channel watching from smart meter data using energy disaggregation
Smart meters are used to measure the energy consumption of households. Within the energy domain, smart meter data have been used for load forecasting, the reduction of consumer bills, and the reduction of grid distortions. Smart meters can also be used to disaggregate the energy consumption at the device level. In this paper, we investigated the potential of identifying the multimedia content played by a TV or monitor device using the house's central smart meter, which measures the aggregated energy consumption of all working appliances in the household. The proposed architecture was based on the elastic matching of aggregated energy signal frames with 20 reference TV channel signals. Different elastic matching algorithms, which use symmetric distance measures, were evaluated, with the best video content identification accuracy of 93.6% achieved using the MVM algorithm.
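The MVM algorithm itself is not reproduced here; classic dynamic time warping (DTW) illustrates the elastic-matching idea of comparing an energy frame against reference channel signals (the frame and reference values below are invented):

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic-time-warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical aggregated-energy frame and two reference channel signatures.
frame = [0.0, 1.0, 2.0, 1.0, 0.0]
refs = {"ch1": [0.0, 1.0, 2.0, 2.0, 1.0, 0.0],   # same shape, time-stretched
        "ch2": [5.0, 5.0, 5.0, 5.0, 5.0]}
best = min(refs, key=lambda c: dtw(frame, refs[c]))
```

Elastic matching tolerates the time stretching of `ch1`, so the frame is still assigned to the correct reference despite the length mismatch.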
This paper presents a fault detection method for three-phase induction motors using the Wavelet Packet Transform (WPT). The proposed algorithm takes a frame of samples from the three-phase supply current of an induction motor. The three-phase current samples are then combined to generate a single current signal by computing the Root Mean Square (RMS) value of the three phase currents at each time stamp. The resulting current samples are then divided into windows of 64 samples, and each resulting window of samples is processed separately. The proposed algorithm uses two methods to create the windows, called non-overlapping windows and moving/overlapping windows. Non-overlapping windows are created by simply dividing the current samples into windows of 64 samples, while the moving windows are generated by taking the first 64 current samples, and each subsequent moving window is then generated by moving the window across the current samples by one sample at a time. The new window consists of the last 63 samples of the previous window and one new sample. The overlapping method reduces the fault detection time to single-sample accuracy; however, it is computationally more expensive than the non-overlapping method and requires more computer memory. The resulting windows are processed as follows: the proposed algorithm performs a two-level WPT on each window, dividing its coefficients into four wavelet subbands. Information in the high-frequency wavelet subbands is then used for fault detection and for activating the trip signal to disconnect the motor from the power supply. The proposed algorithm was first implemented on the MATLAB platform, and the Entropy Energy (EE) of the high-frequency WPT subbands' coefficients was used to determine the condition of the motor. If the induction motor is faulty, the algorithm proceeds to identify the type of the fault.
An empirical setup of the proposed system was then implemented, and the proposed algorithm was tested under real conditions, where different faults were practically induced in the induction motor. Experimental results confirmed the effectiveness of the proposed technique. To generalize the proposed method, the experiment was repeated on different types of induction motors of different working ages and power ratings. Experimental results show that the capability of the proposed method is independent of the type of motor used and its age.
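The RMS combination, overlapping windowing, and two-level subband split described above can be sketched as follows. The Haar wavelet and the balanced three-phase test signal are assumptions made for illustration (the papers' wavelet family is not specified here):

```python
import numpy as np

def rms_combine(ia, ib, ic):
    """Combine three phase currents into one signal: per-sample RMS."""
    return np.sqrt((ia**2 + ib**2 + ic**2) / 3.0)

def moving_windows(x, size=64):
    """Overlapping windows advancing one sample at a time."""
    return [x[i:i + size] for i in range(len(x) - size + 1)]

def haar_wpt_level2(w):
    """Two-level Haar wavelet packet transform: four subbands of len(w)//4."""
    split = lambda b: ((b[0::2] + b[1::2]) / np.sqrt(2),   # low half
                       (b[0::2] - b[1::2]) / np.sqrt(2))   # high half
    lo, hi = split(w)
    ll, lh = split(lo)
    hl, hh = split(hi)
    return ll, lh, hl, hh

# Healthy, balanced three-phase supply (50 Hz, 256 samples).
t = np.arange(256) / 256.0
i_a = np.sin(2 * np.pi * 50 * t)
i_b = np.sin(2 * np.pi * 50 * t - 2 * np.pi / 3)
i_c = np.sin(2 * np.pi * 50 * t + 2 * np.pi / 3)
current = rms_combine(i_a, i_b, i_c)   # constant sqrt(0.5) for a balanced load
wins = moving_windows(current, 64)     # 193 overlapping 64-sample windows
subbands = haar_wpt_level2(wins[0])
```

For the balanced (fault-free) supply the RMS signal is flat, so the high-frequency subbands carry essentially no energy; a fault would disturb the balance and raise that energy, which is the quantity the trip logic monitors.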
A Deep Learning-Based Approach to Detect Correct Suryanamaskara Pose
We present a technique to analyse Suryanamaskar poses using keypoint estimation and statistical analysis. The proposed approach uses a model trained on the COCO keypoint detection dataset to determine keypoints in yoga poses. Our work uses keypoint detection to propose a yoga self-correction system. A novel dataset, Surya-yoga, containing 10000 Suryanamaskara poses, has been generated and made publicly available. The model presented in this paper performed better on the COCO dataset and the combined COCO and Surya-yoga dataset when tested using part affinity fields. The work also presents an analytical method of distinguishing different Suryanamaskar poses alongside deep learning methods.
Conceptual design of an autonomous unmanned aerial vehicle
A conceptual design of a medium-size unmanned aerial vehicle (UAV) is reported in the present study. This study is part of a larger research project investigating the ability of an advanced UAV to autonomously traverse an aircraft and accurately detect locations of defects on the aircraft skin, to assist in further investigation of subsurface defects utilizing thermal and ultrasound techniques. The aim of the research is to reduce the inspection time of a commercial aircraft to the order of an hour. This paper focuses on the aerodynamic and structural design of a UAV capable of carrying the required optical, thermal, and ultrasound cameras. The design requirements include the ability of the UAV to traverse the aircraft's outer skin autonomously, sending both its GPS location and captured images to a secure data drive within the vicinity of the aircraft hangar, and the ability to wirelessly charge its battery and thereby improve its hovering duration, i.e., its endurance. To achieve the preliminary design, the aerodynamic coefficients are estimated using computational fluid dynamics analysis, as well as the smart design of fan shrouds to minimize
Ear biometrics has been found to be a good and reliable technique for human recognition. Initially, ear biometrics could not gain popularity because there were doubts about its uniqueness, but it gained momentum after a theory emerged stating that it is very unlikely for any two ears to be completely identical in all respects. The implemented methodology consists of steps such as pre-processing, feature extraction, and matching based on the selected features. Our technique determines the extent to which these features support matching. The proposed work has been carried out on a dataset containing 60 images, analyzing their features and matching the source image against the dataset images. The results have been obtained on the basis of the number of images correctly classified. The system accuracy indicates the extent to which matching could be performed on the basis of the selected features.
Forward Error Correction (FEC) is a commonly adopted mechanism to mitigate packet loss and bit errors during real-time communication. An adaptive, fuzzy-based FEC algorithm providing a robust video quality metric for multimedia transmission over wireless networks is proposed to optimize the redundancy of the code words generated by a Reed-Solomon encoder and to save network channel bandwidth. The scheme is based on probability estimates derived from the data loss rates reported by the recovery mechanism at the client end. Applying the adaptive FEC, the server uses these reports to predict the next network loss rate with a curve-fitting technique and to generate the optimized number of redundant packets needed to meet specific residual error rates at the client end. Simulation results in a cellular system show that the video quality adapts closely to the optimized FEC codes based on the probability of packet loss and packet correlation in a wireless environment.
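The redundancy-sizing idea for a Reed-Solomon erasure code can be sketched under an i.i.d. packet-loss assumption (the fuzzy controller and curve-fitting predictor are not reproduced; `parity_needed` is an illustrative helper, not the paper's algorithm):

```python
from math import comb

def parity_needed(k, loss_rate, residual_target, max_parity=32):
    """Smallest number of RS parity packets r such that, for a (k+r, k)
    erasure code under i.i.d. packet loss, the probability of losing more
    than r of the k+r packets (an unrecoverable block) <= residual_target."""
    for r in range(max_parity + 1):
        n = k + r
        p_fail = sum(comb(n, i) * loss_rate**i * (1 - loss_rate)**(n - i)
                     for i in range(r + 1, n + 1))
        if p_fail <= residual_target:
            return r
    return max_parity

# e.g. 10 data packets, 5% predicted loss, 1% residual-error budget:
r = parity_needed(k=10, loss_rate=0.05, residual_target=0.01)
```

In the adaptive scheme, `loss_rate` would be the value predicted from client reports, so the parity count rises and falls with channel conditions instead of being fixed.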
Support Vector Machine (SVM) is a learning-based algorithm, which is widely used for classification in many applications. Despite its advantages, its application to large-scale datasets is limited due to its use of a large number of support vectors and the dependency of its performance on its kernel parameter. This paper presents a Sliding Mode Control based Support Vector Machine Radial Basis Function kernel parameter optimization (SMC-SVM-RBF) method, inspired by sliding mode closed-loop control theory, which has demonstrated significantly higher performance than the standard closed-loop control technique. The proposed method first defines an error equation and a sliding surface and then iteratively updates the RBF kernel parameter based on sliding mode control theory, forcing the SVM training error to converge below a predefined threshold value. The closed-loop nature of the proposed algorithm increases the robustness of the technique to uncertainty and improves its convergence speed. Experimental results were generated using nine standard benchmark datasets covering a wide range of applications. Results show that the proposed SMC-SVM-RBF method is significantly faster than classical SVM-based techniques. Moreover, it generates more accurate results than most state-of-the-art SVM-based methods.
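The sliding-mode-style parameter update can be sketched schematically. The SVM training error is replaced by a stand-in signal, and the gain and threshold are illustrative, so this shows only the structure of the control loop:

```python
def smc_tune(error_of, gamma0, eta=0.05, threshold=1e-2, max_iter=500):
    """Sliding-mode-style update: step against the sign of the sliding
    surface s_k = e_k until the error falls below the threshold."""
    gamma = gamma0
    for _ in range(max_iter):
        e = error_of(gamma)
        if abs(e) <= threshold:       # reached the sliding band: stop
            break
        gamma -= eta * (1 if e > 0 else -1)
    return gamma

# Stand-in error signal with its zero at an (unknown to the tuner) gamma* = 1.5.
# In the real method this would come from retraining the SVM at each gamma.
error = lambda g: g - 1.5
g = smc_tune(error, gamma0=3.0)
```

The sign-based step is what distinguishes the sliding-mode flavour from a plain gradient update: the magnitude of the error only decides when to stop, not how far to move.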
COVID-19 detection from chest X-rays using transfer learning with deep convolutional neural networks
Several well-known pretrained deep convolutional neural network models were evaluated on their ability to detect COVID-19 from chest X-ray images, following a transfer learning approach. The retrained models were tested on two different datasets containing COVID-19, normal, viral, and bacterial pneumonia cases. The best-performing models among those evaluated were MobileNet, DenseNet, and ResNet after transfer-learning retraining, with top classification accuracies ranging from 96.76% to 100%, thus indicating the potential of detecting the new coronavirus from X-ray images.
Color constancy adjustment techniques
This chapter presents an overview of color constancy adjustment techniques. The concept of color constancy within digital images is first introduced, and then some of the recent color correction methods are discussed. Some publicly available benchmark image datasets, which are used by researchers to assess the performance of color correction methods, are introduced. These datasets contain both real and synthetic images of scenes illuminated by single or multiple light sources. Color constancy quality assessment measures, which are widely used in the literature, are also detailed. Finally, the performance of different color correction methods on images of different benchmark image datasets is assessed and compared. The chapter demonstrates that the learning-based approaches outperform the statistics-based algorithms at significantly higher computational cost. Moreover, their performance is very data-dependent, while recent statistics-based methods achieve slightly lower performance than the learning-based algorithms at significantly lower computational cost and data dependency.
Hyperspectral imaging and its applications for vein detection: a review
With advances in technology, hyperspectral imaging has become an emerging area of research due to its numerous advantages over conventional imaging techniques. HyperSpectral (HS) cameras generate images of high spectral as well as spatial resolution. Hence, HS images carry much more information from the scene than conventional red, green and blue (RGB) images. This has inspired researchers to use HS technologies for many different applications, ranging from crime investigations to crop monitoring. It is important to accurately detect veins during surgical treatments, but this often turns out to be difficult. Wrongly locating veins or anatomical variations could result in accidental injury to blood vessels, which could lead to a longer operation time or even create serious complications. Furthermore, for the majority of medical procedures, it is necessary to accurately define the location of veins. Over the past years, various methods, including near infrared (NIR) and multi-spectral image processing based methods, have been proposed to help detect and accurately locate veins. However, the performance of these methods is limited, and demand for more accurate and convenient methods is increasing. HS images are two-dimensional (2D) representations of the scene at many spectral bands. This brings the challenge of processing high-dimensional data, which requires significant processing power. Various methods such as Principal Component Analysis (PCA), Moving Window-PCA and Folded-PCA, which are widely used to reduce the dimensionality of HS image data, are reviewed in this book chapter. Conventional RGB, HS, NIR and multispectral images are studied, and then HS imaging systems are introduced. Different applications of HS imaging are reviewed and their potential for vein detection is highlighted.
Different techniques for reducing high dimensional data are discussed, and finally, different vein detection methods and some of the existing vein benchmark datasets are also introduced.
Subcutaneous vein detection is critical in medical procedures like venipuncture and catheter placement. This PhD thesis introduces comprehensive research on Hyperspectral Imaging (HSI)-based vein detection. A Hyperspectral (HS) image dataset for vein detection, called HyperVein, is presented. The effectiveness of several state-of-the-art dimensionality reduction techniques, including Principal Component Analysis (PCA), Folded Principal Component Analysis (FPCA), and Ward's Linkage Strategy using Mutual Information (WaLuMI), along with Support Vector Machine (SVM) classification for vein detection, is investigated using the HyperVein dataset. Results show that FPCA-based methods generate more accurate results than WaLuMI- and PCA-based methods. An effective dimensionality reduction method for vein detection from HS images, called Inter Band Correlation and Clustering (ICC), was developed. The proposed ICC method normalizes each spectral band, computes a correlation matrix across the normalized HS image bands, and uses a subset of the correlation matrix's eigenvectors to create a feature space by projecting the input HS image data onto the eigenvectors. Clustering is then applied to the resulting feature-space coefficients to map them into a more effective feature space, generating the dimensionality-reduced representation of the input HS image. The reduced HS image and the SVM classification algorithm were used for vein detection. Experimental results show that the proposed method outperforms existing methods. Finally, a Convolutional Neural Network (CNN)-based method for vein detection in HS images of human hands is proposed. It applies the PCA, FPCA, WaLuMI, ICC_k-means and ICC_spectral dimensionality reduction techniques to extract HS image features. A 3D-CNN model was trained on the dimensionality-reduced HSI data to accurately identify vein pixel locations. Experimental results demonstrate that the ICC_k-means method outperforms PCA and FPCA.
A Novel Multiresolution Perceptual and Statistically Based Image Coding Scheme
In this paper, a new hybrid multiresolution Human Visual System and statistically based image coding scheme is presented. It decorrelates the input image into a number of subbands using a lifting-based wavelet transform and employs a novel statistically based coding algorithm to code the coefficients in the detail subbands. Perceptual weights are applied to regulate the threshold value of each detail subband required in the coding process. The baseband coefficients are losslessly coded. To evaluate the performance of the coding scheme, it was applied to a number of test images with and without perceptual weights. The results indicate significant improvement in both subjective and objective quality of the reconstructed images when the perceptual weights are employed. The performance of the proposed technique was also compared to JPEG and JPEG2000. The results show that the proposed computationally efficient coding scheme outperforms both coding standards at low compression ratios, while offering satisfactory performance at higher compression ratios. © IEEE.
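The lifting scheme underlying such transforms can be illustrated with the Haar wavelet (an illustration of lifting in general, not the paper's specific filter bank):

```python
import numpy as np

def haar_lift(x):
    """One level of the lifting-based Haar transform: split, predict, update."""
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    detail = odd - even            # predict: odd samples from their even neighbours
    approx = even + detail / 2     # update: approximation preserves the signal mean
    return approx, detail

def haar_unlift(approx, detail):
    """Inverse lifting: undo the update, undo the predict, merge."""
    even = approx - detail / 2
    odd = detail + even
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

x = np.array([4.0, 6.0, 5.0, 9.0])
a, d = haar_lift(x)    # approximation (baseband) and detail coefficients
```

Applied recursively to the approximation, this yields the multiresolution subband pyramid; the detail subbands `d` are what the perceptually weighted thresholds act on.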
Colour constancy is the ability to measure the colour of objects independent of the light source, while colour casting is the presence of unwanted colour in digital images. Colour casting significantly affects the performance of image processing algorithms such as image segmentation and object recognition. The presence of a large uniform background within the image considerably deteriorates the performance of many state-of-the-art colour constancy algorithms. This paper presents a colour constancy method using the sub-blocks of the image to alleviate the effect of large uniform colour areas in the scene. The proposed method divides the input image into a number of non-overlapping blocks, and the Average Absolute Difference (AAD) value of each block's colour components is calculated. The blocks with AAD values greater than the threshold values, which are empirically determined for each colour component, are considered to have sufficient colour information. The selected blocks are then used to determine the scaling factors that achieve achromatic values for the input image's colour components. Comparing the performance of the proposed technique with state-of-the-art methods using images from three datasets shows that the proposed method outperforms the state-of-the-art techniques in the presence of large uniform colour patches.
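The block-selection and scaling idea can be sketched as follows. The block size, thresholds, and grey-world-style scaling used here are illustrative assumptions, not the paper's empirically determined values:

```python
import numpy as np

def aad(block):
    """Average Absolute Difference of a 2-D block from its mean."""
    return np.abs(block - block.mean()).mean()

def colour_scaling_factors(img, block=8, thresh=(2.0, 2.0, 2.0)):
    """Per-channel scaling factors computed only from blocks whose
    per-channel AAD exceeds the thresholds (i.e. non-uniform blocks)."""
    h, w, _ = img.shape
    picked = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = img[y:y + block, x:x + block]
            if all(aad(b[..., c]) > thresh[c] for c in range(3)):
                picked.append(b)
    pixels = np.concatenate([b.reshape(-1, 3) for b in picked])
    means = pixels.mean(axis=0)
    return means.mean() / means    # scale each channel toward a common grey

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, (32, 32, 3))
img[:16] = 200.0                   # large uniform patch: AAD = 0, so excluded
scales = colour_scaling_factors(img)
```

Because the uniform half of the image contributes no selected blocks, it cannot bias the illuminant estimate, which is precisely the failure mode of whole-image statistics that the paper targets.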
Colour cast is the presence of unwanted colour in digital images due to the source illuminant, while colour constancy is the ability to perceive the colours of objects invariant to the colour of the source illuminant. Existing statistics-based colour constancy methods use all image pixel values for illuminant estimation. However, not every region of an image contains reliable colour information, and the presence of large uniform colour patches within the image considerably deteriorates the performance of colour constancy algorithms. This paper presents an algorithm to alleviate the biasing effect of uniform colour patches on colour constancy compensation techniques. It employs the k-means clustering algorithm to segment image areas according to their colour information. The Average Absolute Difference (AAD) of each colour component of each segment is calculated and used to identify and exclude segments with uniform colour information from the colour constancy adjustment. Experimental results were generated using three benchmark datasets and compared with state-of-the-art techniques. Results show that the proposed technique outperforms existing techniques in the presence of uniform colour patches and performs similarly to the Grey World method in their absence.
Colour constancy refers to the task of revealing the true colour of an object despite the ambient presence of an intrinsic illuminant. The performance of most existing colour constancy algorithms deteriorates when the image contains a large patch of uniform colour. This paper presents a Max-RGB based colour constancy adjustment method using the sub-blocks of the image to significantly reduce the effect of large uniform colour areas of the scene on the colour constancy adjustment of the image. The proposed method divides the input image into a number of non-overlapping blocks and computes the Average Absolute Difference (AAD) value of each block's colour components. The blocks with AAD values greater than the threshold values are considered to have sufficient colour variation to be used for colour constancy adjustment. The Max-RGB algorithm is then applied to the selected blocks' pixels to calculate colour constancy scaling factors for the whole image. Evaluations of the performance of the proposed method on images of three benchmark datasets show that the proposed method outperforms the state-of-the-art techniques in the presence of large uniform colour patches.
Face recognition has become a field of interest in many applications such as security and entertainment. In surveillance systems, the quality of the recorded footage is sometimes insufficient due to the distance and angle of the camera from the scene. This causes the object of interest, e.g. the face of a person in the scene, to be of low resolution, which makes recognition more difficult. Image resolution enhancement is a potential solution for enlarging low-resolution images for real-time face recognition. An enlarged image is then compared to an available database of images to either identify or verify individuals. However, the performance of face recognition techniques under various image enlargement methods has not been investigated. In this research, the performance of a PCA-based face recognition method with the three most well-known image enlargement techniques (Nearest Neighbour, Bilinear, Bicubic) is investigated. First, an input image is downsampled to six different resolutions. The downsampled image is then enlarged to its original size using the three named image enlargement techniques. The enlarged image is then input to a PCA face recognition system for the recognition process. Simulation results using images from the SCface database show that PCA-based face recognition achieves superior results when input images are enlarged using the Nearest Neighbour technique, while the performance of the Bicubic and Bilinear techniques is slightly lower.
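Nearest Neighbour enlargement, the best-performing method in this study, amounts to simple index replication, which can be sketched for an integer scale factor as:

```python
import numpy as np

def nearest_enlarge(img, factor):
    """Nearest-neighbour enlargement by an integer factor: each pixel is
    replicated into a factor x factor block (no interpolation)."""
    rows = np.repeat(np.arange(img.shape[0]), factor)
    cols = np.repeat(np.arange(img.shape[1]), factor)
    return img[rows][:, cols]

small = np.array([[1, 2],
                  [3, 4]])
big = nearest_enlarge(small, 2)
```

Because no new intensity values are invented, the enlarged image keeps the original pixel statistics, which is one plausible reason it pairs well with a PCA-based recogniser.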
Hyperspectral imaging plays a pivotal role in various fields, particularly in precise human vein detection within medical diagnostics. However, dealing with large-scale hyperspectral (HS) data presents challenges. To address this, dimensionality reduction techniques are commonly employed to enhance data manageability during processing. A novel dimensionality reduction approach, Correlation-PCA Fusion and Clustering (CoPCA-Clus), is introduced and rigorously compared with established techniques, namely Principal Component Analysis (PCA) and Folded PCA (FPCA), specifically for vein detection in HS images. Results demonstrate that CoPCA-Clus surpasses PCA and FPCA, exhibiting superior performance across accuracy, precision, recall, false positive rate, and false negative rate. Additionally, performance metrics are derived for each technique, and classification images are generated. Subsequently, morphological operations enhance the visualization of vein regions within the HS image.
HyperVein: A Dataset for Human Vein Detection from Hyperspectral Images
Hyperspectral (HS) imaging plays a pivotal role in various fields, including medical diagnostics, where precise human vein detection is crucial. Hyperspectral image data are very large and can cause computational complexities. Dimensionality reduction techniques are often employed to streamline HS image data processing. This paper investigates the effectiveness of three dimensionality reduction techniques, namely Principal Component Analysis (PCA), Folded PCA (FPCA), and Ward's Linkage Strategy using Mutual Information (WaLuMI), for vein detection using HS images. An HS image dataset, encompassing left- and right-hand images captured from 100 subjects with varying skin tones, was created and annotated using anatomical data to represent vein and non-vein areas within the images. To generate experimental results, the HS image dataset was divided into training and test datasets. Optimal parameters for each of the dimensionality reduction techniques, in conjunction with Support Vector Machine binary classification, were determined using the training dataset. The performance of the three dimensionality-reduction-based vein detection methods was then assessed and compared using the test image dataset. Results show that the FPCA-based method outperforms the other two methods in terms of accuracy. For visualization purposes, the classification prediction image for each technique is post-processed using morphological operators, and the results show the significant potential of HS imaging in vein detection.
Hyperspectral imaging has become crucial in various domains, especially for the accurate detection of human veins in medical diagnostics, though managing the extensive data from hyperspectral (HS) images remains a challenge. To improve data handling during analysis, dimensionality reduction methods are frequently utilized. This paper presents a dimensionality reduction method for HS images using HS image inter-band cross-correlation and the k-means clustering algorithm. The proposed method computes inter-band correlations across all bands of the input HS image, which form a 2D correlation matrix. Eigen-decomposition is applied to the resulting matrix, extracting its eigenvectors and eigenvalues. The k-means clustering algorithm is then applied to a selection of eigenvectors representing the largest eigenvalues, splitting the eigenvectors into several clusters. The reduced HS image is generated by averaging each cluster's image bands. The proposed dimensionality reduction method, together with the Support Vector Machine (SVM) classifier, was then used for vein detection in HS images. The HyperVein image dataset was used to generate experimental results. Experimental results were generated for the proposed method, Principal Component Analysis (PCA), and Folded PCA (FPCA). Results show that the proposed method outperforms PCA and FPCA in most performance metrics.
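The correlate-decompose-cluster-average pipeline can be sketched on a toy cube. The eigenvector count, cluster count, and k-means initialization below are assumptions for illustration, not the paper's settings:

```python
import numpy as np

def icc_reduce(hs, n_eigs=4, k=3, iters=20):
    """ICC-style reduction of an HS cube (H, W, B) to k bands: inter-band
    correlation -> eigen-decomposition -> k-means on eigenvector rows ->
    average the original image bands inside each cluster."""
    h, w, b = hs.shape
    corr = np.corrcoef(hs.reshape(-1, b).T)            # B x B correlation matrix
    vals, vecs = np.linalg.eigh(corr)
    feats = vecs[:, np.argsort(vals)[::-1][:n_eigs]]   # one feature row per band
    centres = feats[:: b // k][:k].copy()              # spread-out initial centres
    for _ in range(iters):                             # tiny plain k-means
        labels = np.argmin(((feats[:, None] - centres) ** 2).sum(-1), axis=1)
        centres = np.stack([feats[labels == j].mean(0) if np.any(labels == j)
                            else centres[j] for j in range(k)])
    return np.stack([hs[..., labels == j].mean(-1) if np.any(labels == j)
                     else hs.mean(-1) for j in range(k)], axis=-1)

# Toy cube: 12 bands forming 3 correlated groups of 4 (group = band index // 4).
rng = np.random.default_rng(0)
base = rng.uniform(size=(8, 8, 3))
hs = np.stack([base[..., i // 4] + 0.01 * rng.standard_normal((8, 8))
               for i in range(12)], axis=-1)
reduced = icc_reduce(hs)
```

Averaging correlated bands within a cluster both reduces dimensionality and suppresses per-band noise, which is the design rationale behind clustering the eigenvector rows rather than projecting onto them directly.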
This paper investigates the use of machine learning techniques on hyperspectral images of pistachios to detect and classify different levels of aflatoxin contamination. Aflatoxins are toxic compounds produced by moulds, posing health risks to consumers. Current detection methods are invasive and contribute to food waste. This paper explores the feasibility of a non-invasive method using hyperspectral imaging and machine learning to classify aflatoxin levels accurately, potentially reducing waste and enhancing food safety. Hyperspectral imaging with machine learning has shown promise in food quality control. The paper evaluates models including Dimensionality Reduction with K-Means Clustering, Residual Networks (ResNets), Variational Autoencoders (VAEs), and Deep Convolutional Generative Adversarial Networks (DCGANs). Using a dataset from Leeds Beckett University with 300 hyperspectral images covering three aflatoxin levels (<8 ppb, >160 ppb, and >300 ppb), key wavelengths were identified that indicate the presence of contamination. Dimensionality Reduction with K-Means achieved 84.38% accuracy, while a ResNet model using the 866.21 nm wavelength reached 96.67%. VAE and DCGAN models, though promising, were constrained by the dataset size. The findings highlight the potential of machine learning-based hyperspectral imaging in pistachio quality control, and future research should focus on expanding datasets and refining models for industry application.
Hyperspectral Imaging and its Applications for Vein Detection – a Review
With advances in technology, hyperspectral imaging has become an emerging area of research, due to its numerous advantages over conventional imaging techniques. HyperSpectral (HS) cameras generate images of high spectral as well as spatial resolution. Hence, HS images carry much more information about the scene than conventional Red, Green and Blue (RGB) images. This has inspired researchers to use HS technologies for many different applications, ranging from crime investigations to crop monitoring. It is important to accurately detect veins during surgical treatments, but this often proves difficult. Wrongly locating veins, or anatomical variations, could result in accidental injury to blood vessels. This could lead to a longer operation time or even create serious complications. Furthermore, for the majority of medical procedures, it is necessary to accurately define the location of veins. Over the past years, various methods, including Near InfraRed (NIR) and Multi-Spectral image processing-based methods, have been proposed to help detect and accurately locate veins. However, the performance of these methods is limited, and demand for more accurate and convenient methods is increasing. Hyperspectral images are 2D representations of the scene at many spectral bands. This brings the challenge of processing high-dimensional data, which requires significant processing power. Various methods such as Principal Component Analysis (PCA), Moving Window-PCA, and Folded-PCA, which are widely used to reduce the dimensionality of HS image data, are reviewed in this book chapter. Conventional RGB, hyperspectral, near infrared, and multispectral images are studied and then hyperspectral imaging systems are introduced. Different applications of hyperspectral imaging are reviewed and their potential for vein detection is highlighted.
Different techniques for reducing high dimensional data are discussed, and finally, different vein detection methods and some of the existing vein benchmark datasets are also introduced.
Progressive multiresolution perceptual and statistically based image codec
This paper presents a progressive multiresolution human visual system and statistically based image-coding scheme. The proposed coding scheme decorrelates the input image into a number of subbands using a lifting based wavelet transform and employs a novel statistically-based coding algorithm to code the coefficients in the detail subbands. Perceptual weights are applied to regulate the threshold value of each detail subband that is required in the coding process. The baseband coefficients are losslessly coded. The coded subbands are used for progressive image transmission. To evaluate the performance of the coding scheme, it was applied to a number of test images with and without perceptual weights. The results indicate significant improvement in both subjective and objective quality of the reconstructed images when the perceptual weights are employed. The performance of the new progressive image codec was also compared to JPEG and JPEG2000. The results show that the proposed computationally efficient coding scheme outperforms both coding standards at low compression ratios, while offering satisfactory performance at higher compression ratios. The application of the codec to progressive image transmission is also investigated on a series of test images.
A novel statistical and DCT based image encoder
This paper presents a novel statistical and discrete cosine transform (DCT) based image-coding scheme. The proposed coding scheme divides the input image into a number of non-overlapping pixel blocks. The coefficients in each block are then decorrelated into their spatial frequencies using a discrete cosine transform. Coefficients with the same spatial frequency at different blocks are put together to generate a number of matrices, where each matrix contains coefficients of a particular spatial frequency. The matrix containing DC coefficients is losslessly coded to preserve visually important information. Matrices consisting of high frequency coefficients are coded using a novel statistically based coding algorithm developed in this paper. Perceptual weights are used to regulate the threshold value required in the coding process of the high frequency matrices. The proposed coding scheme and JPEG were applied to three test images, Lena, Elaine and House, and results show that the proposed coding scheme outperforms JPEG subjectively and objectively at low compression ratios. Results also indicate that images decoded using the proposed codec have superior subjective quality at high compression ratios, while JPEG suffers from blocking artifacts at high compression ratios.
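The regrouping step above, collecting same-frequency DCT coefficients from every block into one matrix per frequency, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: an 8x8 block size and an orthonormal DCT-II basis; the function names are not from the paper.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2.0 / n)

def regroup_frequencies(img, bs=8):
    """Block DCT, then gather same-frequency coefficients into one
    matrix per spatial frequency index (u, v)."""
    d = dct_matrix(bs)
    h, w = img.shape
    by, bx = h // bs, w // bs
    blocks = img[:by * bs, :bx * bs].reshape(by, bs, bx, bs).transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T                  # 2D DCT of every block at once
    # result[u, v] is a (by, bx) matrix holding coefficient (u, v) of every
    # block; result[0, 0] is the DC matrix that would be losslessly coded
    return coeffs.transpose(2, 3, 0, 1)
```

Because the basis is orthonormal, the transform is exactly invertible (`d.T @ C @ d` recovers each block), so the regrouping loses no information before quantisation.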
Compressive sampling and wavelet-based multi-view image compression scheme
A multi-view image codec using a disparity compensated lifting based wavelet transform and 'compressive sampling (CS)' is presented. The input images are de-correlated into their sub-bands using disparity compensated view filtering lifting based wavelet transforms. A wavelet transform is then applied to the baseband view, de-composing it into its sub-bands. High-frequency sub-bands are separately hard thresholded. Wavelet weights for high-frequency sub-bands are calculated and used to adjust threshold values for different sub-bands. The CS algorithm is then employed to generate measurements for each resulting sub-band. At the decoder side, the Basis Pursuit method is used to recover the high-frequency sub-bands. Results indicate that the proposed codec significantly outperforms the state-of-the-art CS-based multi-view image codecs. © 2012 The Institution of Engineering and Technology.
Iris Biometrics Recognition in Security Management
Application of iris recognition for human identification has significant potential for developing a robust identification system. This is due to the fact that iris patterns are unique to each individual, differ between the left and right eyes, and remain almost stable over time. However, the performance of existing iris recognition systems depends on the signal processing algorithms they use for iris segmentation, feature extraction and template matching. Like any other signal processing system, the performance of an iris recognition system depends on the level of noise in the image and deteriorates as the level of noise increases. The building blocks of iris recognition systems, techniques to mitigate the effect of noise at each stage, criteria to assess the performance of different iris recognition techniques, and publicly available iris datasets are discussed in this chapter.
Multiresolution HVS and statistically based image coding scheme
In this paper a novel multiresolution human visual system and statistically based image coding scheme is presented. It decorrelates the input image into a number of subbands using a lifting based wavelet transform. The codec employs a novel statistical encoding algorithm to code the coefficients in the detail subbands. Perceptual weights are applied to regulate the threshold value of each detail subband that is required in the statistical encoding process. The baseband coefficients are losslessly coded. An extension of the codec to the progressive transmission of images is also developed. To evaluate the performance of the coding scheme, it was applied to a number of test images and its performance with and without perceptual weights is evaluated. The results indicate significant improvement in both subjective and objective quality of the reconstructed images when perceptual weights are employed. The performance of the proposed technique was also compared to JPEG and JPEG2000. The results show that the proposed coding scheme outperforms both coding standards at low compression ratios, while offering satisfactory performance at higher compression ratios. © Springer Science + Business Media, LLC 2009.
Statistical, DCT and vector quantisation-based video codec
The authors present a novel hybrid statistical, DCT and vector quantisation-based video-coding technique. In intra mode of operation, an input frame is divided into a number of non-overlapping pixel blocks. A discrete cosine transform then converts the coefficients in each block into the frequency domain. Coefficients with the same frequency index at different blocks are put together generating a number of matrices, where each matrix contains the coefficients of a particular frequency index. The matrix, which contains the DC coefficients, is losslessly coded. Matrices containing high frequency coefficients are coded using a novel statistical encoder. In inter mode of operation, overlapped block motion estimation / compensation is employed to exploit temporal redundancy between successive frames and generates a displaced frame difference (DFD) for each inter-frame. A wavelet transform then decomposes the DFD-frame into its frequency subbands. Coefficients in the detail subbands are vector quantised while coefficients in the baseband are losslessly coded. To evaluate the performance of the codec, the proposed codec and the adaptive subband vector quantisation (ASVQ) video codec, which has been shown to outperform H.263 at all bitrates, were applied to a number of test sequences. Results indicate that the proposed codec outperforms the ASVQ video codec subjectively and objectively at all bitrates. © 2008 The Institution of Engineering and Technology.
Development of stereo video codecs in the latest multi-view extension of HEVC (MV-HEVC) with higher compression efficiency has been an active area of research. In this paper, a frame interleaved stereo video coding scheme based on the MV-HEVC standard codec is proposed. The proposed codec applies a reduced layer approach to encode the frame interleaved stereo sequences. A frame interleaving algorithm is developed to reorder the stereo video frames into a monocular video, such that the proposed codec can exploit inter-view and temporal correlations to improve its coding performance. To evaluate the performance of the proposed codec, three standard multi-view test video sequences, named “Poznan_Street”, “Kendo” and “Newspaper1”, were selected and coded using the proposed codec and the standard MV-HEVC codec at different QPs and bitrates. Experimental results show that the proposed codec gives significantly higher coding performance than the standard MV-HEVC codec at all bitrates.
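The reordering idea can be shown with a short sketch. The paper's exact interleaving order is not given in the abstract, so a simple alternating left/right pattern is assumed here purely for illustration:

```python
def interleave_stereo(left, right):
    """Reorder left/right stereo frames into one monocular sequence.
    Assumed pattern: L0, R0, L1, R1, ... (illustrative, not the paper's
    exact ordering)."""
    assert len(left) == len(right), "views must have equal frame counts"
    out = []
    for l, r in zip(left, right):
        out.extend([l, r])
    return out

def deinterleave(seq):
    """Recover the left and right views from the interleaved sequence."""
    return seq[0::2], seq[1::2]
```

Because adjacent frames in the interleaved sequence alternate between views, a standard monocular encoder's temporal prediction implicitly exploits both inter-view and temporal correlation.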
Evaluation of Wavelet Transform Families in Image Resolution Enhancement
Wavelet based image enlargement technique
This paper presents an image enlargement technique using a wavelet transform. The proposed technique considers the low-resolution input image as the wavelet baseband and estimates the information in the high-frequency subbands from the wavelet high-frequency sub-bands of the input image using wavelet filters. The super-resolution image is finally generated by applying an inverse wavelet transform on the high-resolution sub-bands. To evaluate the performance of the proposed image enlargement technique, five standard test images with a variety of frequency components were chosen and enlarged using the proposed technique and six state-of-the-art algorithms. Experimental results show the proposed technique significantly outperforms the classical and non-classical super-resolution methods, both subjectively and objectively.
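A minimal sketch of the idea follows, under loudly stated assumptions: it uses a one-level Haar transform and crude nearest-neighbour upsampling of the input's own detail subbands as the high-frequency estimate, whereas the paper uses proper wavelet filters for this estimation. The structure (input image as baseband, estimated details, inverse transform) matches the abstract.

```python
import numpy as np

def haar2(x):
    """One-level 2D orthonormal Haar transform -> (LL, LH, HL, HH)."""
    s = 1 / np.sqrt(2)
    lo = (x[:, 0::2] + x[:, 1::2]) * s
    hi = (x[:, 0::2] - x[:, 1::2]) * s
    ll = (lo[0::2] + lo[1::2]) * s; lh = (lo[0::2] - lo[1::2]) * s
    hl = (hi[0::2] + hi[1::2]) * s; hh = (hi[0::2] - hi[1::2]) * s
    return ll, lh, hl, hh

def ihaar2(ll, lh, hl, hh):
    """Inverse of haar2 (perfect reconstruction)."""
    s = 1 / np.sqrt(2)
    h, w = ll.shape
    lo = np.zeros((2 * h, w)); hi = np.zeros((2 * h, w))
    lo[0::2] = (ll + lh) * s; lo[1::2] = (ll - lh) * s
    hi[0::2] = (hl + hh) * s; hi[1::2] = (hl - hh) * s
    out = np.zeros((2 * h, 2 * w))
    out[:, 0::2] = (lo + hi) * s; out[:, 1::2] = (lo - hi) * s
    return out

def enlarge(lr):
    """Treat the LR image as the HR baseband; estimate HR detail subbands
    from the LR image's own detail subbands (here: nearest upsampling)."""
    _, lh, hl, hh = haar2(lr)
    up = lambda b: np.kron(b, np.ones((2, 2)))   # 2x nearest-neighbour upsample
    return ihaar2(lr, up(lh), up(hl), up(hh))
```

An n x n input yields a 2n x 2n output, and with zeroed detail estimates the scheme degenerates to plain wavelet-domain zero-padding enlargement.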
One of the largest future applications of computer vision is in the healthcare industry. Computer vision tasks are implemented in diverse medical imaging scenarios, including detecting or classifying diseases, predicting potential disease progression, analyzing cancer data to advance future research, and conducting genetic analysis for personalized medicine. However, a critical drawback of Computer Vision (CV) approaches is their limited reliability and transparency. Clinicians and patients must comprehend the rationale behind predictions or results to ensure trust and ethical deployment in clinical settings. This motivates the adoption of Explainable Computer Vision (X-CV), which enhances the interpretability of vision models. Among various methodologies, attribution-based approaches are widely employed by researchers to explain medical imaging outputs by identifying influential features. This article aims to explore how attribution-based X-CV methods work in medical imaging, what they are good for in real-world use, and what their main limitations are. This study evaluates X-CV techniques by conducting a thorough review of relevant reports, peer-reviewed journals, and methodological approaches to obtain an adequate understanding of attribution-based approaches. It explores how these techniques tackle computational complexity issues, improve diagnostic accuracy and aid clinical decision-making processes. This article intends to present a path that generalizes the concept of trustworthiness towards AI-based healthcare solutions.
This paper presents a novel variance of subregions and discrete cosine transform based image-coding scheme. The proposed encoder divides the input image into a number of non-overlapping blocks. The coefficients in each block are then transformed into their spatial frequencies using a discrete cosine transform. Coefficients with the same spatial frequency index at different blocks are put together, generating a number of matrices, where each matrix contains coefficients of a particular spatial frequency index. The matrix containing DC coefficients is losslessly coded to preserve its visually important information. Matrices containing high frequency coefficients are coded using a variance of sub-regions based encoding algorithm proposed in this paper. Perceptual weights are used to regulate the threshold value required in the coding process of the high frequency matrices. An extension of the system to progressive image transmission is also developed. The proposed coding scheme, JPEG and JPEG2000 were applied to a number of test images. Results show that the proposed coding scheme outperforms JPEG and JPEG2000 subjectively and objectively at low compression ratios. Results also indicate that images decoded by the proposed codec exhibit superior subjective quality at high compression ratios compared to those of JPEG, while offering satisfactory results compared to those of JPEG2000.
Multiresolution, perceptual and vector quantization based video codec
This paper presents a novel Multiresolution, Perceptual and Vector Quantization (MPVQ) based video coding scheme. In the intra-frame mode of operation, a wavelet transform is applied to the input frame and decorrelates it into its frequency subbands. The coefficients in each detail subband are pixel quantized using a uniform quantization factor divided by the perceptual weighting factor of that subband. The quantized coefficients are finally coded using a quadtree-coding algorithm. Perceptual weights are specifically calculated for the centre of each detail subband. In the inter-frame mode of operation, a Displaced Frame Difference (DFD) is first generated using an overlapped block motion estimation/compensation technique. A wavelet transform is then applied on the DFD and converts it into its frequency subbands. The detail subbands are finally vector quantized using an Adaptive Vector Quantization (AVQ) scheme. To evaluate the performance of the proposed codec, the proposed codec and the adaptive subband vector quantization coding scheme (ASVQ), which has been shown to outperform H.263 at all bitrates, were applied to six test sequences. Experimental results indicate that the proposed codec outperforms the ASVQ subjectively and objectively at all bit rates. © 2011 Springer Science+Business Media, LLC.
However, achieving these qualities requires resolving a number of trade-offs between various properties during system design and operation. This paper reviews trade-offs in distributed replicated databases and provides a survey of recent research papers studying distributed data storage. The paper first discusses the compromise between consistency and latency that appears in distributed replicated data storage and follows directly from the CAP and PACELC theorems. Consistency refers to the guarantee that all clients in a distributed system observe the same data at the same time. To ensure strong consistency, distributed systems typically employ coordination mechanisms and synchronization protocols that involve communication and agreement among distributed replicas. These mechanisms introduce additional overhead and latency and can dramatically increase the time taken to complete operations when replicas are globally distributed across the Internet. In addition, we study trade-offs between other system properties, including availability, durability, cost, energy consumption, and read and write latency. This paper also provides a comprehensive review and classification of recent research works in distributed replicated databases. The reviewed papers showcase several major areas of research, ranging from performance evaluation and comparison of various NoSQL databases to new strategies for data replication and new consistency models. In particular, we observed a shift towards exploring hybrid consistency models of causal consistency and eventual consistency with causal ordering, due to their ability to strike a balance between operation ordering guarantees and high performance. Researchers have also proposed various consistency control algorithms and consensus quorum protocols to coordinate distributed replicas.
Insights from this review can empower practitioners to make informed decisions in designing and managing distributed data storage systems as well as help identify existing gaps in the body of knowledge and suggest further research directions.
Brain tumor (BT) is a devastating disease and one of the foremost causes of death in human beings. BT develops mainly in two stages, varies by volume, form, and structure, and can be treated with special clinical procedures such as chemotherapy, radiotherapy, and surgical mediation. With revolutionary advancements in radiomics and medical imaging research in the past few years, computer-aided diagnostic (CAD) systems, especially deep learning, have played a key role in the automatic detection and diagnosis of various diseases and have provided accurate decision support for medical clinicians. The convolutional neural network (CNN) is a commonly utilized methodology for detecting various diseases from medical images because it is capable of extracting distinct features from an image under investigation. In this study, a deep learning approach is utilized to extract distinct features from brain images in order to detect BT. Hence, a CNN trained from scratch and transfer learning models (VGG-16, VGG-19, and LeNet-5) are developed and tested on brain images to build an intelligent decision support system for detecting BT. Since deep learning models require large volumes of data, data augmentation is used to populate the existing dataset synthetically in order to utilize the best-fit detecting models. Hyperparameter tuning was conducted to set the optimum parameters for training the models. The achieved results show that the VGG models outperformed the others with an accuracy rate of 99.24%, average precision of 99%, average recall of 99%, average specificity of 99%, and average f1-score of 99%. Compared to other state-of-the-art models in the literature, the proposed models show better performance in terms of accuracy, sensitivity, specificity, and f1-score.
Moreover, comparative analysis shows that the proposed models are reliable in that they can be used for detecting BT as well as helping medical practitioners to diagnose BT.
The intrinsic properties of the ambient illuminant significantly alter the true colors of objects within an image. Most existing color constancy algorithms assume a uniformly lit scene across the image. The performance of these algorithms deteriorates considerably in the presence of mixed illuminants. Hence, a potential solution to this problem is the consideration of a combination of image regional color constancy weighting factors (CCWFs) in determining the CCWF for each pixel. This paper presents a color constancy algorithm for mixed-illuminant scene images. The proposed algorithm splits the input image into multiple segments and uses the normalized average absolute difference (NAAD) of each segment as a measure for determining whether the segment’s pixels contain reliable color constancy information. The Max-RGB principle is then used to calculate the initial weighting factors for each selected segment. The CCWF for each image pixel is then calculated by combining the weighting factors of the selected segments, which are adjusted by the normalized Euclidean distances of the pixel from the centers of the selected segments. Experimental results on images from five benchmark datasets show that the proposed algorithm subjectively outperforms the state-of-the-art techniques, while its objective performance is comparable to those of the state-of-the-art techniques.
Application of the wavelet transform for image camera source identification has been widely reported in the literature, and the reported techniques use different wavelets. Due to the wavelets’ diversity and properties, it is beneficial for the research community to identify the best-performing wavelets for this application. This paper presents results for assessing the performance of the conventional wavelet-based image camera source identification technique against forty-one wavelets from the Daubechies, Biorthogonal, Symlets, and Coiflets wavelet families. The VISION image dataset, comprising 34,427 images captured by eleven camera brands across thirty-five models, was used to generate experimental results. One hundred plain images from each camera brand dataset were randomly selected, where 70% of each dataset’s images were used to compute the camera brand’s signature and 30% were used to assess the performance of the method. Normalized cross-correlation of the camera brand signature and the calculated image noise was used to find the camera match. To compare the method’s performance when using different wavelets, a new assessment criterion was introduced and used to quantify the method’s performance across images of different camera brands. Results show that the conventional wavelet-based image camera source identification achieves its highest performance when it uses the sym2 wavelet, closely followed by coif1.
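The matching step can be sketched as follows. The wavelet-based noise residual extraction itself is omitted; this illustrative snippet only shows the normalized cross-correlation decision between a precomputed residual and per-brand signatures (function names are assumptions, not the paper's code):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify_camera(residual, signatures):
    """Return the brand whose reference signature correlates best with the
    image's noise residual. `signatures` maps brand name -> signature array
    (e.g. the average residual of that brand's training images)."""
    return max(signatures, key=lambda brand: ncc(residual, signatures[brand]))
```

In practice the residual would come from subtracting a wavelet-denoised version of the image from the image itself, and a brand's signature from averaging the residuals of its training images.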
Low Cost and Energy Efficient Hybrid Wireless Positioning System Using Wi-Fi and Bluetooth Technologies for Wearable Devices
In recent years, the application of Indoor Positioning Systems (IPS) has experienced a significant increase in demand, with the aging of the world’s population and their changing lifestyles. While outdoor positioning systems, such as the Global Positioning System (GPS), have advanced significantly over the years, indoor positioning has been restrained by the limitations of the employed technologies. This paper presents a hybrid wireless positioning system able to locate wearable devices indoors accurately. It is based on Wi-Fi and Bluetooth technologies, using trilateration to determine the position of Bluetooth Low Energy (BLE) wearable devices with an accuracy of up to 1.8 metres. A graphical user interface (GUI) was used to illustrate the performance of the proposed system, allowing users to visualize the captured data in two and three dimensions in real time.
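The trilateration step can be sketched with a standard linearised least-squares fix. This is a generic textbook formulation, not the paper's implementation; the path-loss constants (`tx_power`, exponent `n`) are assumed illustrative values:

```python
import numpy as np

def rssi_to_distance(rssi, tx_power=-59.0, n=2.0):
    """Log-distance path-loss model: tx_power is the assumed RSSI at 1 m,
    n the assumed path-loss exponent. Returns distance in metres."""
    return 10 ** ((tx_power - rssi) / (10 * n))

def trilaterate(anchors, dists):
    """Least-squares 2D position from >= 3 anchor positions and ranges.
    Subtracting the first circle equation from the others linearises the
    system, which is then solved with lstsq."""
    anchors = np.asarray(anchors, float)
    d = np.asarray(dists, float)
    A = 2 * (anchors[1:] - anchors[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + (anchors[1:] ** 2).sum(1) - (anchors[0] ** 2).sum())
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos
```

With noisy RSSI-derived ranges the least-squares solution degrades gracefully, which is why more than three anchors are typically used in practice.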
Gas turbines are a key player in the energy generation sector and thus form a key component of energy systems as critical infrastructure. Determining the key parameters in the optimization and efficiency of gas turbines is of utmost importance to increase their power conversion efficiency. This paper presents a simple power estimation model for a gas turbine considering all its parameters. A set of 7412 multivariate data records from the UCI Machine Learning Repository was used to develop a linear prediction model for estimating the turbine energy yield of a combined cycle power plant. Simulation results show that the inlet temperature of the turbine is the most critical parameter for predicting its energy yield capacity, while ambient atmospheric conditions of temperature, humidity and pressure do not predict its energy yield capacity.
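A linear prediction model of this kind reduces to ordinary least squares. The sketch below is generic (feature ordering and names are not taken from the paper); it fits an intercept plus one weight per input parameter, so the learned weights directly indicate each parameter's predictive contribution:

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept term.
    X: (n_samples, n_features), y: (n_samples,). Returns [bias, w1, w2, ...]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict(w, X):
    """Apply the fitted weights to new samples."""
    return np.column_stack([np.ones(len(X)), X]) @ w
```

On standardised inputs, comparing the magnitudes of the fitted weights is one simple way to rank parameters such as inlet temperature against ambient conditions.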
This paper presents a Chainlet based Multi-Band Ear Recognition using Support Vector Machine (CMBER-SVM) algorithm. The proposed method divides the gray input image into a number of bands based on the intensity of its pixels, resembling a hyperspectral image. It then applies Canny edge detection on each resulting normalized band, extracting edges that represent the ear pattern in each band. The resulting binary edge maps are then flattened, generating a single binary edge map. This edge map is then split into non-overlapping cells, and the Freeman chain code for each group of connected edges within each cell is calculated. A histogram of each group of four contiguous cells is calculated, and the resulting histograms are then normalized and concatenated to form a chainlet for the input image. The resulting chainlet histogram vectors of the images of the dataset are then used for training and testing a pairwise Support Vector Machine (SVM). Experimental results on images of two benchmark ear image datasets show that the proposed CMBER-SVM technique outperforms both the state-of-the-art statistical and learning-based ear recognition methods.
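The Freeman chain code and per-cell histogram at the core of a chainlet can be sketched as follows. This minimal illustration assumes the standard 8-direction Freeman convention (0 = east, counting counter-clockwise) and an already-traced, ordered sequence of edge pixels; cell splitting, block normalisation, and the SVM are omitted:

```python
# 8-connected Freeman directions in (row, col) image coordinates:
# index 0 = east, then counter-clockwise (1 = north-east, ..., 6 = south)
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def freeman_chain_code(points):
    """Chain code of an ordered sequence of 8-connected edge pixels."""
    code = []
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        code.append(DIRS.index((r1 - r0, c1 - c0)))
    return code

def chain_histogram(code):
    """Normalized 8-bin direction histogram (one chainlet cell descriptor)."""
    hist = [0.0] * 8
    for c in code:
        hist[c] += 1
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

Concatenating such histograms over groups of neighbouring cells, after normalisation, yields the chainlet feature vector fed to the SVM.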
Principal Component Analysis (PCA) has been successfully applied to many applications, including ear recognition. This paper presents a 2D Wavelet based Multi-Band Principal Component Analysis (2D-WMBPCA) ear recognition method, inspired by PCA based techniques for multispectral and hyperspectral images. The proposed 2D-WMBPCA method performs a 2D non-decimated wavelet transform on the input image, dividing it into its wavelet subbands. Each resulting subband is then divided into a number of frames based on its coefficients’ values. The multi-frame generation boundaries are calculated using either equal size or greedy hill climbing techniques. Conventional PCA is applied on each subband’s resulting frames, yielding its eigenvectors, which are used for matching. The intersection of the energy of the eigenvectors and the total number of features for each subband gives the number of bands that yields the highest matching performance. Experimental results on the images of two benchmark ear datasets, called IITD II and USTB I, demonstrate that the proposed 2D-WMBPCA technique significantly outperforms Single Image PCA by up to 56.79% and the eigenfaces technique by up to 20.37% with respect to matching accuracy. Furthermore, the proposed technique achieves very competitive results to those of learning based techniques at a fraction of their computational time and without needing to be trained.
Colour Constancy Adjustment Techniques
This chapter presents an overview of colour constancy adjustment techniques. The concept of colour constancy within digital images is first introduced, and then some recent colour correction methods are discussed. Some publicly available benchmark standard image datasets, which are used by researchers to assess the performance of colour correction methods, are introduced. These datasets contain both real and synthetic images of scenes illuminated by a single or multiple light source(s). Colour constancy quality assessment measures, which are widely used in the literature, are also detailed. Finally, the performance of different colour correction methods on images of different benchmark image datasets is assessed and compared. The chapter demonstrates that the learning-based approaches outperform the statistical-based algorithms at significantly higher computation costs. Moreover, their performance is very data dependent, while recent statistical-based methods have slightly lower performance than the learning-based algorithms at significantly lower computation cost and data dependency.
This paper presents an algorithm to retrieve the true colour of an image captured under multiple illuminants. The proposed method uses a histogram analysis and the K-means++ clustering technique to split the input image into a number of segments. It then determines the normalised average absolute difference (NAAD) of each resulting segment’s colour components. If the NAAD of a segment’s component is greater than an empirically determined threshold, it assumes that the segment does not represent a uniform colour area; hence, the segment’s colour component is selected to be used for image colour constancy adjustment. The initial colour balancing factor for each chosen segment’s component is calculated using the Minkowski norm, based on the principle that the average values of image colour components are achromatic. It finally calculates colour constancy adjustment factors for each image pixel by fusing the initial colour constancy factors of the chosen segments, weighted by the normalised Euclidean distances of the pixel from the centroids of the selected segments. Experimental results using benchmark single and multiple illuminant image datasets show that the proposed method’s images subjectively exhibit the highest colour constancy in the presence of multiple illuminants and also when the image contains uniform colour areas.
Extreme presence of the source light in digital images decreases the performance of many image processing algorithms, such as video analytics, object tracking and image segmentation. This paper presents a color constancy adjustment technique, which lessens the impact of large unvarying color areas of the image on the performance of the existing statistical based color correction algorithms. The proposed algorithm splits the input image into several non-overlapping blocks. It uses the Average Absolute Difference (AAD) value of each block’s color component as a measure to determine whether the block has adequate color information to contribute to the color adjustment of the whole image. It is shown through experiments that by excluding the unvarying color areas of the image, the performance of the existing statistical-based color constancy methods is significantly improved. The experimental results on four benchmark image datasets validate that the proposed framework, applied to the Gray World, Max-RGB and Shades of Gray statistics-based methods, produces images with significantly higher subjective and competitive objective color constancy than those of the existing and state-of-the-art methods.
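The block-selection idea can be sketched with a small NumPy example. This is an illustrative reduction, assuming a Gray-World correction as the underlying statistical method, an 8x8 block size, and an arbitrary AAD threshold; the paper's actual thresholds and per-method details are not reproduced:

```python
import numpy as np

def block_aad(block):
    """Average absolute difference from the block mean, per colour channel."""
    return np.abs(block - block.mean(axis=(0, 1))).mean(axis=(0, 1))

def gray_world_gains(img, bs=8, thresh=0.02):
    """Gray-World correction gains computed only from blocks whose AAD
    exceeds `thresh`; flat (unvarying) blocks are excluded. img is HxWx3
    with values in [0, 1]; bs and thresh are illustrative assumptions."""
    h, w, _ = img.shape
    keep = []
    for r in range(0, h - bs + 1, bs):
        for c in range(0, w - bs + 1, bs):
            blk = img[r:r + bs, c:c + bs]
            if block_aad(blk).mean() > thresh:
                keep.append(blk.reshape(-1, 3))
    pix = np.concatenate(keep) if keep else img.reshape(-1, 3)
    avg = pix.mean(0)
    return avg.mean() / avg              # per-channel gains; apply as img * gains
```

Excluding flat blocks prevents a large uniformly coloured wall or sky from dragging the channel averages, which is the failure mode of the plain Gray-World assumption that the paper targets.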
This paper presents a colour constancy algorithm for images of scenes lit by non-uniform light sources. The proposed method determines the number of colour regions within the image using a histogram-based algorithm. It then applies the K-means++ algorithm to the input image, dividing the image into its segments. The proposed algorithm computes the normalized average absolute difference (NAAD) of each segment’s coefficients and uses it as a measure to determine whether the segment’s coefficients have sufficient colour variation. The initial colour constancy adjustment factors for each segment with sufficient colour variation are calculated based on the principle that the average values of the colour components of the image are achromatic. The colour constancy adjustment weighting factors (CCAWFs) for each pixel of the image are determined by fusing the CCAWFs of the segments with sufficient colour variation, weighted by the normalized Euclidean distance of the pixel from the centres of the segments. Experimental results were generated using both indoor and outdoor benchmark images of scenes illuminated by single or multiple illuminants. Results show that the proposed method outperforms the state-of-the-art techniques subjectively and objectively.
Ear recognition is a field in biometrics wherein images of the ears are used to identify individuals. Many techniques have been developed for ear recognition; however, most of the existing techniques have been tested on high-resolution images taken in a laboratory environment. This research examines the performance of Principal Component Analysis (PCA) based ear recognition in conjunction with super-resolution algorithms on low-resolution ear images. Ear images are first split into database and query images; the latter are filtered and down-sampled, generating a set of ear images of different low resolutions. The resulting low-resolution images are then enlarged to their original sizes using an assortment of neural-network-based and statistical-based super-resolution methods. PCA is then applied to the images, generating their eigenvalues, which are used as features for matching. Experimental results on the images of a benchmark dataset show that the statistical-based super-resolution techniques, namely those that are wavelet-based, outperform the other algorithms with respect to ear recognition accuracy.
A Novel Unequal Error Protection Scheme for Low Bit-Rate Mobile Video Transmission
A Novel Progressive Image Coding Scheme for Handheld Videophone Applications
Progressive Multi-resolution HVS and Statistically Based Image Codec
Single Image Ear Recognition Using Wavelet-Based Multi-Band PCA
Principal Component Analysis (PCA) has been successfully used for many applications, including ear recognition. This paper presents a 2D Wavelet-based Multi-Band PCA (2D-WMBPCA) method, inspired by PCA-based techniques for multispectral and hyperspectral images, which have shown significantly higher performance than standard PCA. The proposed method performs a 2D non-decimated wavelet transform on the input image, dividing the image into its subbands. It then splits each resulting subband evenly into a number of bands based on the coefficient values. Standard PCA is then applied to each resulting set of bands to extract the subbands’ eigenvectors, which are used as features for matching. Experimental results on images of two benchmark ear image datasets show that the proposed 2D-WMBPCA significantly outperforms both the standard PCA method and the eigenfaces method.
Applications in harsh environments greatly suffer from intermittent faults in their interconnections/wirings. Due to the erratic behavior of intermittency that causes signal irregularities, it is difficult to distinguish irregularities from an actual transmitted signal, particularly in the earlier stages, where signal abnormalities mainly resemble noise. This paper explores step changes in the resistance of a wire caused by broken strands as a failure parameter. A test rig was designed to emulate the ageing mechanism of the wire, with the results of the study highlighting that resistance step changes could effectively be used to locate intermittent faults in low-power cable applications.
Principal Component Analysis (PCA) has been successfully used for many applications, including ear recognition. However, its performance is limited due to its significant data dependency. This paper presents a two-dimensional multi-band PCA (2D-MBPCA) method, which has shown significantly higher performance than standard PCA. The proposed method divides the input gray image into a number of images, based on the intensity of its pixels, using either dynamic or predefined equal-range threshold values. PCA is then applied to the resulting set of images to extract their features. The resulting features are used to find the best match. The application of the proposed 2D-MBPCA to ear recognition using two benchmark ear image datasets shows that it significantly outperforms the standard PCA method.
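The intensity-band splitting described above can be illustrated with a minimal sketch: the image is partitioned into band images by equal-range thresholds, and PCA (via SVD) is run over the resulting set of band images. The number of bands, the thresholding rule, and the centring scheme are assumptions for illustration only.

```python
import numpy as np

def multiband_pca_features(image, n_bands=4, n_components=3):
    """Illustrative sketch of the 2D-MBPCA idea: split a grey image into
    `n_bands` band images using equal-range intensity thresholds, then run
    PCA (via SVD) across the set of band images to obtain eigenvector
    features. Parameter values are assumptions for illustration."""
    img = image.astype(float)
    lo, hi = img.min(), img.max()
    edges = np.linspace(lo, hi, n_bands + 1)
    bands = []
    for i in range(n_bands):
        # include the top edge in the last band
        upper = img <= edges[i + 1] if i == n_bands - 1 else img < edges[i + 1]
        mask = (img >= edges[i]) & upper
        bands.append(np.where(mask, img, 0.0).ravel())
    X = np.stack(bands)            # one row per band image
    X = X - X.mean(axis=0)         # centre each pixel across the bands
    # PCA via SVD of the centred data matrix
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_components]       # principal directions as features
```

The returned rows are orthonormal principal directions that could then be compared across gallery and query images.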
A Progressive Statistical and Discrete Cosine Transform based Image codec
Digital camera sensors are designed to record all incident light from a captured scene, but they are unable to distinguish between the colour of the light source and the true colour of objects. The resulting captured image exhibits a colour cast toward the colour of the light source. This paper presents a colour constancy algorithm for images of scenes lit by non-uniform light sources. The proposed algorithm uses a histogram-based algorithm to determine the number of colour regions. It then applies the K-means++ algorithm to the input image, dividing the image into its segments. The proposed algorithm computes the Normalized Average Absolute Difference (NAAD) for each segment and uses it as a measure to determine whether the segment has sufficient colour variation. The initial colour constancy adjustment factors for each segment with sufficient colour variation are calculated. The Colour Constancy Adjustment Weighting Factors (CCAWF) for each pixel of the image are determined by fusing the CCAWFs of the segments, weighted by the normalized Euclidean distance of the pixel from the centres of the segments. Results show that the proposed method outperforms the statistical techniques and its images exhibit significantly higher subjective quality than those of the learning-based methods. In addition, the execution time of the proposed algorithm is comparable to that of statistical-based techniques and is much lower than those of the state-of-the-art learning-based methods.
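The distance-regulated fusion step in this abstract can be illustrated as follows. The abstract states only that normalized Euclidean distances weight the fusion, so the inverse-distance blending rule below is an assumption, and the segment centres and per-segment gains are supplied by the caller.

```python
import numpy as np

def fuse_segment_gains(shape, centres, gains):
    """Sketch of the fusion step: per-pixel correction gains obtained by
    blending each segment's gains, weighted by the pixel's inverse
    normalised Euclidean distance to the segment centres. Inverse-distance
    weighting is an assumed rule; the paper specifies only that the
    distances regulate the fusion."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    weights = []
    for (cy, cx) in centres:
        d = np.hypot(ys - cy, xs - cx) / np.hypot(h, w)  # normalised distance
        weights.append(1.0 / (d + 1e-6))                 # nearer segment -> larger weight
    W = np.stack(weights)                                # (n_segments, h, w)
    W /= W.sum(axis=0, keepdims=True)
    # blend per-channel gains: (n_segments, 3) -> (h, w, 3)
    return np.einsum('shw,sc->hwc', W, np.asarray(gains, float))
```

Pixels near a segment centre inherit (almost exactly) that segment's gains, while pixels in between receive a smooth mixture, which is the behaviour the abstract describes.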
This paper presents a PCA-based iris recognition method called Intensity Separation Curvelet based PCA (ISC-PCA). The proposed method uses Canny edge detection and the Hough transform to extract and rectangularize the iris from the input eye image. The second-generation Fast Digital Curvelet Transform (FDCT) is then applied to the resulting image, dividing it into its subbands. The resulting complex subband coefficients within the same level are concatenated, generating two single frames. The coefficients in each resulting frame are then normalized and evenly divided into a preselected number of bands. The coefficient matrices within each frame are then vectorized and concatenated, generating a single 2D matrix. Conventional PCA is then performed on the resulting 2D matrix, extracting its eigenvectors, which are used for iris matching. The Euclidean distance is used as a measure to quantify the closeness of different iris images. Experimental results on images from the CASIA-Iris-Interval benchmark eye image dataset show that the proposed ISC-PCA technique significantly outperforms the state-of-the-art PCA-based methods and achieves competitive results to those of the learning-based techniques.
Physics-guided Synthetic CFD Data Generation and Explainable Deep Learning Methods for Automated Flow Pattern Classification
Computational Fluid Dynamics (CFD) is widely used to analyze fluid flow patterns, but interpreting these patterns manually is time-consuming and requires expert knowledge. This paper introduces a method that combines physics-guided synthetic CFD data generation with explainable deep learning models to automate flow pattern classification. The approach involves generating synthetic data using physics-based simulations and training deep learning models—specifically Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs)—to classify flow patterns. Explainable AI techniques are applied to interpret the model decisions. The results show that Vision Transformers outperform CNNs in classification accuracy and offer better interpretability.
Source Camera Identification Techniques: A Survey
Successful investigation and prosecution of major crimes like child pornography, insurance claims, movie piracy, traffic monitoring, and scientific fraud, among others, largely depend on the availability of water-tight evidence to prove the case beyond any reasonable doubt. When the evidence required in investigating and prosecuting such crimes involves digital images/videos, there is a need to prove without an iota of doubt the source camera/device of the image in question. Much research has been reported to address this need over the past decade. The proposed methods can be divided into brand- or model-level identification or known imaging device matching techniques. This paper investigates the effectiveness of the existing image/video source camera identification techniques, which use both intrinsic hardware-artefact-based techniques, like sensor pattern noise and lens optical distortion, and software-artefact-based techniques, like colour filter array and auto white balancing, to determine their strengths and weaknesses. Publicly available benchmark image/video datasets and assessment criteria to quantify the performance of different methods are presented, and the performance of the existing methods is compared. Finally, directions for further research on image source identification are given.
Image Band-Distributive PCA Based Face Recognition Technique
This paper presents an Image Band-Distributive PCA (IBD-PCA) based technique for face recognition. The proposed method consists of four steps. In the first step, the reference image is pre-processed by converting its pixel values and performing histogram equalization to increase its contrast. In the second step, the equal-size boundary calculation method is used to calculate the boundary-splitting values that divide the input image into multiple images with respect to the band intensities of its pixels. In the third step, Principal Component Analysis (PCA) is used to extract features from the images, which are then used as the input for the fourth step. In the last step, matching is performed by calculating the Euclidean distance between principal components. The proposed technique has been tested on the ORL Face Database and the Yale Face Database. The experimental results demonstrate that the proposed technique outperforms other techniques on the same databases.
HyperSpectral Imaging (HSI) plays a pivotal role in various fields, including medical diagnostics, where precise human vein detection is crucial. HyperSpectral (HS) image data are very large and can cause computational complexities. Dimensionality reduction techniques are often employed to streamline HS image data processing. This paper presents an HS image dataset encompassing left- and right-hand images captured from 100 subjects with varying skin tones. The dataset was annotated using anatomical data to represent vein and non-vein areas within the images. This dataset is utilised to explore the effectiveness of dimensionality reduction techniques, namely Principal Component Analysis (PCA), Folded PCA (FPCA), and Ward’s Linkage Strategy using Mutual Information (WaLuMI), for vein detection. To generate experimental results, the HS image dataset was divided into training and test datasets. The optimum-performing parameters for each of the dimensionality reduction techniques, in conjunction with Support Vector Machine (SVM) binary classification, were determined using the training dataset. The performance of the three dimensionality-reduction-based vein detection methods was then assessed and compared using the test image dataset. Results show that the FPCA-based method outperforms the other two methods in terms of accuracy. For visualization purposes, the classification prediction image for each technique is post-processed using morphological operators, and the results show the significant potential of HS imaging in vein detection.
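The Folded PCA idea mentioned above can be sketched briefly: each pixel's spectral vector is folded into a small matrix, and scatter is accumulated over the folds, giving a covariance matrix of size fold × fold instead of bands × bands. The fold size and component count below are illustrative assumptions.

```python
import numpy as np

def folded_pca(X, fold, n_components):
    """Sketch of Folded PCA (FPCA) for hyperspectral pixels: each spectral
    vector (length B) is folded into a (B/fold, fold) matrix and the small
    fold x fold scatter matrices are accumulated, giving a much smaller
    covariance than standard PCA. The band count must be divisible by
    `fold`; parameters here are illustrative."""
    n, B = X.shape
    assert B % fold == 0, "band count must be divisible by the fold size"
    folded = X.reshape(n, B // fold, fold)
    S = np.zeros((fold, fold))
    for A in folded:
        S += A.T @ A                        # accumulate fold-level scatter
    vals, vecs = np.linalg.eigh(S / n)
    W = vecs[:, ::-1][:, :n_components]     # leading eigenvectors
    # project every fold segment, then flatten -> reduced feature vector
    return (folded @ W).reshape(n, -1)
```

The reduced features (here of length (B/fold) × n_components per pixel) could then feed a binary classifier such as the SVM used in the paper.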
The successful investigation and prosecution of significant crimes, including child pornography, insurance fraud, movie piracy, traffic monitoring, and scientific fraud, hinge largely on the availability of solid evidence to establish the case beyond any reasonable doubt. When dealing with digital images/videos as evidence in such investigations, there is a critical need to conclusively prove the source camera/device of the questioned image. Extensive research has been conducted in the past decade to address this requirement, resulting in various methods categorized into brand, model, or individual image source camera identification techniques. This paper presents a survey of all those existing methods found in the literature. It thoroughly examines the efficacy of these existing techniques for identifying the source camera of images, utilizing both intrinsic hardware artifacts such as sensor pattern noise and lens optical distortion, and software artifacts like color filter array and auto white balancing. The investigation aims to discern the strengths and weaknesses of these techniques. The paper provides publicly available benchmark image datasets and assessment criteria used to measure the performance of those different methods, facilitating a comprehensive comparison of existing approaches. In conclusion, the paper outlines directions for future research in the field of source camera identification.
HyperVein: A Dataset for Human Vein Detection from Hyperspectral Images
This folder contains a subset of hyperspectral images from the "HyperVein: A Dataset for Human Vein Detection from Hyperspectral Images" dataset, which originally consisted of 200 images. This folder includes: 1. A hyperspectral image dataset containing 10 hyperspectral images representing the left- and right-hand image captures of 5 volunteer participants. The images named with 'a' (e.g. 1a.bil) are left-hand images and those named with 'b' (e.g. 1b.bil) are right-hand images. Both the image file in band-interleaved format (.bil) and the header (.hdr) file have been provided for each image. 2. A dataset containing the ground truth for each of the hyperspectral images. The ground truths have dimensions 1024 by 1024, representing the selected ROI used for the vein detection experiment. This region can be mapped out using the dimensions: row = 291:1314, column = 360:1383. A folder containing the ground truth images scaled to 0 and 255 for visualization purposes is also included. 3. An example MATLAB file for reading and displaying the hyperspectral images, named 'Read_HS_Image'. Make sure to download and install the Hyperspectral Image Processing Toolbox in MATLAB before running the code.
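For readers without MATLAB, a band-interleaved-by-line (.bil) cube of known dimensions can also be read directly with NumPy. In the minimal sketch below, the row/column/band counts and data type, which would normally be parsed from the accompanying .hdr file, are passed in by hand, and byte order is assumed native; these are assumptions for illustration, not part of the dataset's documentation.

```python
import numpy as np

def read_bil(path, rows, cols, bands, dtype=np.uint16):
    """Minimal sketch of reading a band-interleaved-by-line (.bil) cube
    with NumPy. The dimensions and dtype normally come from the .hdr
    file; here they are passed in directly for illustration."""
    raw = np.fromfile(path, dtype=dtype)
    # BIL layout on disk: for each image row, every band's line in turn
    cube = raw.reshape(rows, bands, cols)
    return cube.transpose(0, 2, 1)          # -> (rows, cols, bands)
```

Round-tripping a small synthetic cube through a file confirms the layout handling.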
Improving energy efficiency is a major concern in residential buildings for economic prosperity and environmental stability. Despite growing interest in this area, limited research has been conducted to systematically identify the primary factors that influence residential energy efficiency at scale, leaving a significant research gap. This paper addresses the gap by exploring the key determinant factors of energy efficiency in residential properties using a large-scale energy performance certificate dataset. Dimensionality reduction and feature selection techniques were used to pinpoint the key predictors of energy efficiency. The consistent results emphasise the importance of CO2 emissions per floor area, current energy consumption, heating cost current, and CO2 emissions current as primary determinants, alongside factors such as total floor area, lighting cost, and heated rooms. Further, machine learning models revealed that Random Forest, Gradient Boosting, XGBoost, and LightGBM deliver the lowest mean square error scores of 6.305, 6.023, 7.733, 5.477, and 5.575, respectively, and demonstrated the effectiveness of advanced algorithms in forecasting energy performance. These findings provide valuable data-driven insights for stakeholders seeking to enhance energy efficiency in residential buildings. Additionally, a customised machine learning interface was developed to visualise the multifaceted data analyses and model evaluations, promoting informed decision-making.
Studies have shown that mixed-resolution based video codecs, also known as asymmetric spatial inter/intra-view video codecs, are successful in efficiently coding videos for low bitrate transmission. In this paper, a HEVC-based, spatial-resolution-scaling type of mixed-resolution coding model for frame-interleaved multiview videos is presented. The proposed codec is designed such that the information in the intermediate frames of the center and neighboring views is down-sampled, while the frames retain their original size. The codec’s reference frame structure is designed to efficiently encode frame-interleaved multi-view videos using a HEVC-based mixed-resolution codec. The multi-view test video sequences were coded using the proposed codec and the standard MV-HEVC. Results show that the proposed codec gives significantly higher coding performance than the MV-HEVC codec at low bitrates.
This paper presents an iris segmentation algorithm. The proposed technique applies a histogram-based method to the input eye image, extracting a point within the pupil. The image is then intensity-sampled over M equiangular radial scan lines, generating M 1-dimensional signals. A fuzzy multi-scale edge detection algorithm is then applied to each of the resulting radial signals to accurately detect and locate one positive edge point from the signal. A uniform cubic B-spline approximation method is further applied to the detected edges, determining the iris outer boundary. The histogram of the area within the extracted outer iris boundary of the eye image is finally used to extract the pupil outer boundary. Experimental results on a number of eye test images taken under visible wavelength from the UBIRISv.1 and UBIRISv.2 databases show that the proposed segmentation method accurately extracts the iris boundaries.
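The equiangular radial sampling step described above can be sketched as follows. Nearest-neighbour sampling and the default ray/sample counts are illustrative assumptions; the edge detection that would follow on each 1-D signal is not shown.

```python
import numpy as np

def radial_scan(image, centre, n_rays=64, n_samples=100):
    """Sketch of the radial-scan step: sample a grey image along `n_rays`
    equiangular lines from a point inside the pupil, giving one 1-D
    intensity signal per ray on which edge detection can then be run.
    Nearest-neighbour sampling and the ray/sample counts are assumptions."""
    h, w = image.shape
    cy, cx = centre
    r_max = min(cy, cx, h - 1 - cy, w - 1 - cx)      # stay inside the image
    radii = np.linspace(0, r_max, n_samples)
    signals = np.empty((n_rays, n_samples))
    for k, theta in enumerate(np.linspace(0, 2 * np.pi, n_rays, endpoint=False)):
        ys = np.clip(np.round(cy + radii * np.sin(theta)).astype(int), 0, h - 1)
        xs = np.clip(np.round(cx + radii * np.cos(theta)).astype(int), 0, w - 1)
        signals[k] = image[ys, xs]
    return signals
```

On a synthetic dark pupil in a bright surround, every radial signal starts at the pupil intensity and ends at the background intensity, exactly the transition an edge detector would locate.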
Colour constancy (CC) is the ability to perceive the true colour of the scene in its image regardless of the scene’s illuminant changes. Colour constancy is a significant part of the digital image processing pipeline, more precisely where the true colour of the object is needed. Most existing CC algorithms assume a uniform illuminant across the whole scene of the image, which is not always the case. Hence, their performance is influenced by the presence of multiple light sources. This paper presents a colour constancy algorithm using image texture for uniformly/non-uniformly lit scene images. The proposed algorithm applies the K-means algorithm to segment the input image based on its colour features. Each segment’s texture is then extracted using an entropy analysis algorithm. The colour information of the texture pixels is then used to calculate an initial colour constancy adjustment factor for each segment. Finally, the colour constancy adjustment factors for each pixel within the image are determined by fusing the colour constancy factors of all segments, regulated by the Euclidean distance of each pixel from the centres of the segments. Experimental results on both single- and multiple-illuminant image datasets show that the proposed algorithm outperforms the existing state-of-the-art colour constancy algorithms, particularly when the images are lit by multiple light sources.
As materials undergo large-scale yielding or exhibit large sizes of fracture process zone in the crack tip region, multi-parameter fracture concepts should be employed to describe the complex crack-tip stress-strain fields. Fracture resistance curves (R-curves) are an established tool in characterizing the entire fracture process of such materials. However, for complex materials such as bituminous mixtures, the development of these curves is subject to experimental and computational intricacies. In this research, a framework is developed to automate the construction of R-curves for normal and rubberized asphalt concrete (AC) mixtures. AC mixtures are produced using PG58–22 and PG58–28 binders. Limestone and siliceous aggregates are used, and three binder contents are considered for the mixtures. Single-edge notched beam (SE(B)) fracture testing is conducted on AC beams with two different notch patterns. A convolutional neural network (CNN) model is developed and trained over 1260 test images with varying temperatures, notch geometries, and setups. The CNN model was used to detect the growing crack on the beam surface, and each crack-detected image was sent to an image processing framework to measure the crack length. Crack extension increments were calculated and synchronized with test time and the magnitude of load, load-line displacement, and cumulative fracture energy, and the R-curve could then be constructed. A training accuracy of 0.91 and a loss below 0.10 were obtained for the model as a result of hyperparameter tuning, indicating reliable classifications by the CNN architecture. The R-curves showed desirable agreement for control mixtures at temperatures of 0 °C and −15 °C. As the mixtures are rubberized, the R-curves showed favorable agreement in the crack blunting phase, the transition zone, as well as the unstable propagation phase at −20 °C.
Cohesive energy magnitudes were compared for the two methods with a Pearson coefficient of 0.81, while fracture rate and fracture energy magnitudes were favorably close, with coefficients of 0.89 and 0.98, respectively.
This paper explores the evolution and contemporary significance of Radio Frequency Identification (RFID) technology, focusing on its integration in the retail sector. It synthesizes theoretical foundations, practical insights, and future projections to provide a holistic understanding of RFID's implications for retailers, with a structured analysis of key components and case studies.
The energy sector plays a vital role in driving environmental and social advancements. Accurately predicting energy demand across various time frames offers numerous benefits, such as facilitating a sustainable transition and the planning of energy resources. This research focuses on predicting energy consumption using three individual models: Prophet, eXtreme Gradient Boosting (XGBoost), and long short-term memory (LSTM). Additionally, it proposes an ensemble model that combines the predictions from all three to enhance overall accuracy. This approach aims to leverage the strengths of each model for better prediction performance. We examine the accuracy of the ensemble model using Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE), with a view to supporting resource allocation. The research investigates the use of real data from smart meters gathered from 5567 London residences as part of the UK Power Networks-led Low Carbon London project, obtained from the London Datastore. The performance of each individual model was recorded as follows: 62.96% for the Prophet model, 70.37% for LSTM, and 66.66% for XGBoost. In contrast, the proposed ensemble model, which combines LSTM, Prophet, and XGBoost, achieved an impressive accuracy of 81.48%, surpassing the individual models. The findings of this study indicate that the proposed model enhances energy efficiency and supports the transition towards a sustainable energy future. Consequently, it can accurately forecast the maximum loads of distribution networks for London households. In addition, this work contributes to the improvement of load forecasting for distribution networks, which can guide higher authorities in developing sustainable energy consumption plans.
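The abstract does not state how the three models' forecasts are combined, so the sketch below uses a simple (optionally weighted) average purely as a common baseline, together with the MAPE metric the paper reports; the weighting scheme is an assumption.

```python
import numpy as np

def ensemble_forecast(predictions, weights=None):
    """Generic sketch of combining forecasts from several models (the
    paper's Prophet, LSTM and XGBoost would each supply one array). The
    paper does not state its combination rule; a (weighted) mean is shown
    purely as a baseline."""
    P = np.stack([np.asarray(p, float) for p in predictions])
    if weights is None:
        weights = np.full(len(P), 1.0 / len(P))   # equal weights by default
    w = np.asarray(weights, float)
    return (w[:, None] * P).sum(axis=0)

def mape(actual, forecast):
    """Mean Absolute Percentage Error, one of the paper's three metrics."""
    actual = np.asarray(actual, float)
    forecast = np.asarray(forecast, float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))
```

With two forecasts that err symmetrically around the truth, the averaged ensemble cancels the individual errors, which is the intuition behind combining complementary models.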
Advancements in digital technologies have transformed the world by providing more opportunities and possibilities. However, elderly persons face several challenges utilizing modern technology, leading to digital exclusion, which can negatively impact sustainable development. This research attempts to address current digital exclusion by examining the challenges older people face considering evolving digital technologies, focusing on economic, social, and environmental sustainability. Three distinct goals are pursued in this study: to perform a detailed literature review to identify gaps in the current understanding of digital exclusion among the elderly, to identify the primary factors affecting digital exclusion in the elderly, and to analyze the patterns and trends in different countries, with a focus on differentiating between High-Income Countries (HICs) and Lower Middle-Income Countries (LMICs). The research strategies used in this study involve a combination of a literature review and a quantitative analysis of digital exclusion data from five cohorts. This study uses statistical analyses, such as PCA, the chi-square test, one-way ANOVA, and two-way ANOVA, to present a complete assessment of the digital issues that older persons experience. The expected results include the identification of factors influencing the digital divide and an enhanced awareness of how digital exclusion varies among different socio-economic and cultural settings. The data used in this study were obtained from five separate cohorts over a five-year period from 2019 to 2023. These cohorts include ELSA (UK), SHARE (Austria, Germany, France, Estonia, Bulgaria, and Romania), LASI (India), MHAS (Mexico), and ELSI (Brazil). It was discovered that the digital exclusion rate differs significantly across HICs and LMICs, with the UK having the fewest (11%) and India having the most (91%) digitally excluded people.
It was discovered that three primary factors, including socio-economic status, health-related issues, and age-related limitations, are causing digital exclusion among the elderly, irrespective of the income level of the country. Further analysis showed that the country type has a significant influence on the digital exclusion rates among the elderly, and age group plays an important role in digital exclusion. Additionally, significant variations were observed in the life satisfaction of digitally excluded people within HICs and LMICs. The interaction between country type and digital exclusion also showed a major influence on the health rating. This study has a broad impact since it not only contributes to what we know academically about digital exclusion but also has practical applications for communities. By investigating the barriers that prevent older people from adopting digital technologies, this study will assist in developing better policies and community activities to help them make use of the benefits of the digital era, making societies more equitable and connected. This paper provides detailed insight into intergenerational equity, which is vital for the embedding principles of sustainable development. Furthermore, it makes a strong case for digital inclusion to be part of broader efforts (and policies) for creating sustainable societies.
This research highlights the importance of Emotion Aware Technologies (EAT) and their implementation in serious games to assist children with Autism Spectrum Disorder (ASD) in developing social-emotional skills. As AI is gaining popularity, such tools can be used in mobile applications as invaluable teaching tools. In this paper, a new AI framework application is discussed that will help children with ASD develop efficient social-emotional skills. It uses the Jetpack Compose framework and Google Cloud Vision API as emotion-aware technology. The framework is developed with two main features designed to help children reflect on their emotions, internalise them, and train them how to express these emotions. Each activity is based on similar features from literature with enhanced functionalities. A diary feature allows children to take pictures of themselves, and the application categorises their facial expressions, saving the picture in the appropriate space. The three-level minigame consists of a series of prompts depicting a specific emotion that children have to match. The results of the framework offer a good starting point for similar applications to be developed further, especially by training custom models to be used with ML Kit.
The Sensor Pattern Noise (SPN) extracted from digital pictures can be interpreted as a unique sensor fingerprint of a digital camera and can be used to perform source identification of digital cameras. Scene details, however, can contaminate SPN signatures. This paper presents a method that extracts the SPN by applying a non-decimated wavelet transform to digital pictures and then cleans the contaminated SPN in order to improve the identification rate of the SPNs. The coefficients within the resulting wavelet high-frequency sub-bands are filtered to extract the SPN of the image. Using the non-decimated wavelet transform, we perform a two-step comparison technique that first isolates all the contaminated components of the SPN and neutralises these components in the contaminated SPN. The reinforced SPN is then matched against the corresponding components in the reference camera fingerprint. The two-step comparison technique provides a reinforced SPN with reduced contamination for matching against the camera reference fingerprint. Experiments were performed using images from ten cameras to identify the source camera of the images. Results show that the proposed technique generates superior results to those of the non-reinforced SPNs.
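The residual-based intuition behind SPN extraction can be illustrated with a deliberately simplified sketch. The paper filters non-decimated wavelet high-frequency sub-bands; the code below substitutes a plain mean filter to stay dependency-free, and matches fingerprints with normalised cross-correlation. It demonstrates the principle only, not the paper's reinforcement step.

```python
import numpy as np

def extract_spn(image, kernel=3):
    """Crude SPN sketch: the residual between the image and a smoothed
    version of itself. The paper uses a non-decimated wavelet transform;
    a mean filter stands in here purely to keep the sketch simple."""
    img = image.astype(float)
    pad = kernel // 2
    padded = np.pad(img, pad, mode='edge')
    smooth = np.zeros_like(img)
    for dy in range(kernel):
        for dx in range(kernel):
            smooth += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return img - smooth / kernel ** 2        # high-frequency residual

def ncc(a, b):
    """Normalised cross-correlation for matching an SPN against a
    reference camera fingerprint."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))
```

On synthetic data, two images sharing the same additive fingerprint correlate far more strongly than images from different "cameras", which is the basis of source identification.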
There has been increasing demand for multiview video transmission over band-limited channels in past years, and various techniques have been proposed to fulfil this need. In this paper, a High Efficiency Video Codec (HEVC) based, spatial-resolution-scaling type of mixed-resolution coding model, MRHEVC-MVC, for frame-interleaved multiview videos is presented. However, enabling the HEVC to encode video with different frame resolutions is a challenge due to the coding tree partitioning used by the codec. This has been overcome by superimposing the low-resolution replica of each full-resolution frame on its respective decoded picture buffer and setting the remaining space of the frame buffer to zero. The codec’s reference frame structure is designed to efficiently encode frame-interleaved multiview videos using a HEVC-based mixed-resolution codec. The proposed MRHEVC-MVC codec has been tested against the standard multiview extension of the High Efficiency Video Codec (MV-HEVC) for the “Balloon”, “Newspaper1”, “Undo_Dancer”, “Kendo” and “Poznan_Street” standard multiview video sequences. Results show that the proposed codec gives significantly higher coding performance than the MV-HEVC codec at low bitrates, both subjectively and objectively.
In the nuclear power industry, safety and reliability are of the utmost importance. Sensors and actuators are integral components of such systems, and their faults may adversely impact system performance. It is therefore imperative to design a fault detection and diagnosis (FDD) system that achieves the highest standards of safety. This paper presents a machine learning based FDD technique for actuators and sensors in a pressurized water reactor (PWR). In the proposed FDD framework, faults are first detected using a shallow neural network. Fault diagnosis is then performed using 15 different classifiers provided in the MATLAB Classification Learner toolbox, including support vector machine (SVM), K-nearest neighbor (KNN), and ensemble classifiers. Several classifiers were found to provide superior classification performance, including medium KNN, cubic KNN, cosine KNN, weighted KNN, fine Gaussian SVM, quadratic SVM, medium Gaussian SVM, coarse Gaussian SVM, bagged trees, and subspace KNN. The accuracy of the FDD approach was demonstrated using a set of simulation results.
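As an illustration of the diagnosis step, the following is a minimal pure-Python sketch of a K-nearest-neighbour classifier of the kind listed above. The feature vectors and fault labels are hypothetical, and the authors' actual pipeline uses MATLAB's Classification Learner toolbox, not this code.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify a query feature vector by majority vote among its k
    nearest training samples (Euclidean distance).
    `train` is a list of (feature_vector, fault_label) pairs."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

A weighted or Gaussian-kernel variant, as in the toolbox presets, would replace the plain majority vote with distance-weighted votes.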
This paper presents a High Efficiency Video Codec (HEVC) based spatial mixed-resolution stereo video codec. The proposed codec applies a frame interleaving algorithm to reorder the stereo video frames into a monoscopic video. The challenge in mixed-resolution video coding is enabling the codec to encode frames of different resolutions. This issue is addressed by superimposing a low resolution replica of the decoded I-frame on its respective decoded picture, with the remaining space of the frame set to zero. This significantly reduces the computational cost of finding the best match. The proposed codec's reference frame structure is designed to efficiently exploit both temporal and inter-view correlations. The performance of the proposed codec is assessed using five standard multiview video datasets and benchmarked against the anchor and state-of-the-art techniques. Results show that the proposed codec yields significantly higher coding performance than the anchor and state-of-the-art techniques.
Source camera identification of digital images can be performed by matching the sensor pattern noise (SPN) of the images with the camera reference signature. This paper presents a non-decimated wavelet based source camera identification method for digital images. The proposed algorithm applies a non-decimated wavelet transform to the input image, splitting the image into its wavelet sub-bands. The coefficients within the resulting high-frequency sub-bands are filtered to extract the SPN of the image. Cross-correlation of the image SPN and the camera reference SPN signature is then used to identify the most likely source device of the image. Experimental results were generated using images from ten cameras to identify the source camera of the images. Results show that the proposed technique generates superior results to those of state-of-the-art wavelet based source camera identification methods.
This study proposes a feedback linearization-based control using a dynamic neural network to control a pressurized water-type nuclear power plant. The nonlinear plant model adopted in this study is characterized by five inputs, five outputs, and 38 state variables. The model is linearized through dynamic neural network-based system identification and feedback linearization. A proportional-integral-derivative (PID) controller is subsequently applied to the linearized process. The effectiveness of the proposed approach is demonstrated by simulations on different subsystems of a pressurized water reactor nuclear power plant model. Simulation results show that the proposed strategy offers good performance and is capable of effectively tracking the reference under disturbances.
The standard HEVC codec and its extension for coding multiview videos, known as MV-HEVC, have proven to deliver improved visual quality compared to their predecessor, H.264/MPEG-4 AVC's multiview extension H.264-MVC, with up to 50% bitrate savings at the same frame resolution. MV-HEVC's framework is similar to that of H.264-MVC, which uses a multi-layer coding approach. Hence, MV-HEVC requires all frames from other reference layers to be decoded before a new layer can be decoded. This multi-layer coding architecture is therefore a bottleneck for fast frame streaming across different views. In this paper, an HEVC-based Frame Interleaved Stereo/Multiview Video Codec (HEVC-FISMVC), which uses a single-layer encoding approach to encode stereo and multiview video sequences, is presented. The frames of stereo or multiview video sequences are interleaved in such a way that encoding the resulting monoscopic video stream maximizes the exploitation of temporal, inter-view, and cross-view correlations, thereby improving the overall coding efficiency. The coding performance of the proposed HEVC-FISMVC codec is assessed and compared with that of the standard MV-HEVC for three standard multi-view video sequences, namely “Poznan_Street”, “Kendo” and “Newspaper1”. Experimental results show that the proposed codec provides more substantial coding gains than the anchor MV-HEVC for coding both stereo and multi-view video sequences.
Aflatoxin contamination poses a significant risk to all nuts, including pistachios, during harvest, storage, and processing. Dietary exposure to aflatoxins can lead to severe toxic and carcinogenic effects in humans. To safeguard human and animal health, aflatoxin legislation sets maximum permissible levels for aflatoxins in food products, including pistachios. Consequently, imported pistachios undergo rigorous aflatoxin contamination testing. Traditional methods for measuring aflatoxin levels, such as High-Performance Liquid Chromatography (HPLC), HPLC with Mass Spectrometry, and Enzyme-Linked Immunosorbent Assay (ELISA), although precise, are destructive, costly, and time-consuming. This paper investigates the application of emerging technologies, including Hyperspectral Imaging, Chromatographic Test Strips, Luminescent Metal-Organic Frameworks, spectroscopic methods, machine vision, and advanced artificial intelligence models, to develop a non-intrusive, real-time system for aflatoxin detection in pistachio nuts. Additionally, it outlines a comprehensive strategy to protect public health, mitigate economic losses estimated at $932 million annually, and sustain the pistachio industry.
Aflatoxin contamination in pistachios, caused by Aspergillus flavus and Aspergillus parasiticus, poses significant risks to food safety and global trade due to its carcinogenic properties. This review examines traditional detection methods such as High-Performance Liquid Chromatography and Enzyme-Linked Immunosorbent Assay. Although these techniques are highly precise, they are costly, destructive, and impractical for smallholder farmers. Emerging nondestructive technologies enable rapid, accurate detection without destroying the sample, particularly when Hyperspectral Imaging (HSI) is combined with machine learning. Regulatory thresholds such as the European Union (EU) 8 µg/kg limit for AFB1 create challenges for producers and exporters, especially since HSI methods often lack the precision required for validated quantitative regression at this level on naturally contaminated pistachio kernels. High implementation costs, limited regulatory guidance, and calibration demands hinder its adoption. Climate change heightens contamination risks, calling for predictive models that integrate HSI with environmental data. To support equitable access, especially for smallholder farmers, reducing costs, standardizing protocols, and enhancing global cooperation are essential. These measures will strengthen food safety and regulatory compliance in pistachio production.
Source Camera Identification using Sensor Pattern Noise
Source Camera Identification (SCI) is essential in digital image forensics, enabling reliable attribution of images to their originating devices for legal, investigative, and security applications. Yet, existing SCI methods often struggle under diverse imaging conditions due to scene-dependent noise and texture interference. This thesis advances SCI through four key contributions. First, a systematic evaluation of forty-two wavelets using the VISION dataset (34,427 images, 35 camera models, 11 brands) identified cdf9/7 as the most effective for Sensor Pattern Noise (SPN) extraction, followed by sym2 and coif1. Second, an Improved Camera Source Identification using Wavelet Noise Residuals and Texture Filtering (ICSI-WNRTF) method was developed to suppress high-texture regions, achieving 99% accuracy for model identification and 98% for device attribution. Third, a Curvelet-Based Camera Source Identification Leveraging Image Smooth Regions (CBCSI-SR) framework was introduced. By exploiting multi-scale directional features and isolating smooth regions, it achieved 99.6% model-level and 98.9% device-level accuracy while reducing false decisions. Finally, a Deep Learning-Based Texture Exclusion for Source Camera Identification (DLTESCI) approach combined texture suppression with a fine-tuned ResNet50, reaching 99.7% accuracy and outperforming contemporary methods across Accuracy, Precision, Recall, FPR, and FNR. Together, these contributions establish a progression from wavelet-based to curvelet-based and deep learning-driven SCI techniques, delivering robust, scalable, and highly precise solutions for forensic applications.
Disparity Compensated View Filtering Wavelet Based Multiview Image Codec Using Lagrangian Optimization
This paper presents a disparity compensated view filtering wavelet based multiview image coding scheme. The proposed codec decorrelates the input views into their frequency subbands using a disparity compensated lifting based wavelet transform. The codec then applies a 2D wavelet transform to the baseband image, transforming it into a number of subbands. The general form of the Lagrangian rate-distortion optimization algorithm is then modified and used along with the steepest descent algorithm to assign bits among the different subbands in such a way that the PSNR variance of the decoded images is minimized. Two sets of experimental results are generated using three sets of multiview test images. In the first set of experiments, the effect of using a weighted λ for the high frequency images on the PSNR variance of the decoded images is investigated. In the second set of experiments, the performance of the proposed codec in reducing the variance of the PSNR of the decoded images is investigated. Results indicate that the proposed codec gives lower PSNR variance among the decoded images than the basic form of the codec at all bitrates. ©2008 IEEE.
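The bit-assignment idea can be illustrated with the classical high-rate Lagrangian allocation, where setting the derivative of the total distortion with respect to each subband rate equal to a common Lagrange multiplier yields a closed form. The NumPy sketch below shows only that textbook form; the paper's modified objective (minimising PSNR variance via steepest descent) is not reproduced here.

```python
import numpy as np

def allocate_bits(variances, total_bits):
    """High-rate Lagrangian bit allocation across subbands:
    minimise sum_i var_i * 2**(-2*R_i) subject to sum_i R_i = total_bits.
    Equating each partial derivative to a common Lagrange multiplier
    gives the classical closed-form solution below."""
    v = np.asarray(variances, dtype=float)
    n = len(v)
    geo_mean = np.exp(np.log(v).mean())  # geometric mean of the variances
    # Each subband gets the average rate plus a variance-dependent offset.
    return total_bits / n + 0.5 * np.log2(v / geo_mean)
```

High-variance subbands receive more bits; the offsets sum to zero, so the budget constraint is met exactly.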
Impact of Camera Separation on the Performance of an H.264/AVC based Stereoscopic Video Codec
This paper investigates the impact of camera separation on the performance of an H.264/AVC based stereo-vision video codec. To achieve this, the multi-frame referencing property of H.264/AVC has been employed and the standard H.264/AVC reference software has been modified to support stereoscopic video coding. Experimental results were generated using two sets of wide baseline convergent multi-view test videos: Breakdancers and Ballet. To generate a set of synchronized stereo videos of the same scene with different inter-camera angles, all possible camera pairs were generated and classified according to their inter-camera angles. The resulting sets of stereo videos were coded using H.264/AVC based stereo-vision and simulcast coding schemes at different bitrates. Results indicate that the stereo-vision codec outperforms simulcast coding by up to 3.9 dB at low inter-camera angles and that its advantage deteriorates as the inter-camera angle increases. Finally, a range of inter-camera angles for the best use of either stereo-vision or simulcast coding is determined.
Multiscale fuzzy reasoning (MFR) for automatic object extraction
A new Multiscale Fuzzy Reasoning (MFR) based image-processing technique for automatic object extraction is described. MFR utilizes skewed versions of an input frame wherein each skewed frame is processed independently using fuzzy reasoning. MFR achieves optimal edge detection using the wavelet decomposition followed by a novel fuzzy based decision technique. The processed frames are de-skewed and combined using a fuzzy union operation to extract the objects. © 2004 Elsevier B.V. All rights reserved.
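The final combination step, the fuzzy union of the de-skewed membership maps, is conventionally the pointwise maximum. A one-function NumPy sketch (illustrative, not the authors' code):

```python
import numpy as np

def fuzzy_union(memberships):
    """Standard fuzzy union (pointwise maximum) of several membership
    maps, as used to merge per-skew object masks into one result."""
    return np.maximum.reduce([np.asarray(m, dtype=float) for m in memberships])
```

Each input map holds per-pixel membership grades in [0, 1]; the union keeps, for every pixel, the strongest evidence of an object seen in any skewed frame.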
Development of an Embedded System Based Prototype Automatic Coffee Machine and Detailed Analysis of Its Modules
Effect of inter-camera angles on the performance of an H.264/AVC based multi-view video codec
This paper investigates the effect of inter-camera angles on the performance of an H.264/AVC based multi-view video codec. To achieve this, the H.264/AVC software has been modified to support multi-view video coding using its multi-frame reference property. Results were generated using a wide baseline convergent multi-view video dataset: Breakdancers. To generate a set of three synchronized multi-view videos of the same scene with different inter-camera angles, all possible three-camera combinations were generated and classified according to their inter-camera angles. The resulting set of multi-view videos was coded using H.264/AVC based multi-view and simulcast video codecs at different bitrates. Results demonstrate that the multi-view video codec gives coding gains of up to 1.2 dB over the simulcast coding scheme at low inter-camera angles and that its advantage deteriorates as the inter-camera angles increase. Finally, a range of inter-camera angles for the best use of either multi-view or simulcast coding is determined. © 2012 IEEE.
Colour volumetric compression for realistic view synthesis applications
Colour volumetric data, which is constructed from a set of multi-view images, is capable of providing a realistic immersive experience. However, it is not widely used because of the manifold increase in bandwidth it requires. This paper presents a novel framework to achieve scalable volumetric compression. Based on the wavelet transform, a data rearrangement algorithm is proposed to compact the volumetric data, leading to highly efficient transformation. The colour data is rearranged using the characteristics of the human visual system. A pre-processing scheme for adaptive resolution is also proposed. The low resolution overcomes the limitation of data transmission at low bitrates, whilst the fine resolution improves the quality of the synthesised images. Results show significant improvement in compression performance over traditional 3D coding. Finally, the effect of residual coding is investigated to show the trade-off between compression and view synthesis performance. © Springer Science+Business Media, LLC 2010.
Volumetric Reconstruction with Compressed Data
Volumetric reconstruction is one of the 3D processing technologies for multi-view systems. Existing volume reconstruction methods exploit the original views under many constraints. These algorithms may not be suitable for distributed camera systems in which multiple views are transmitted over lossy networks before processing at the computing centre. Therefore, this paper proposes a novel volumetric reconstruction from compressed multi-view images. The algorithm starts with depth registration, after which the initial volume is refined. Finally, a colour selection scheme assigns realistic colour to the volume, achieving good subjective quality of the rendered views, as shown in the subjective results. Reconstruction from the compressed views may be up to 3 dB inferior to reconstruction from the original views, but can be more than 3 dB superior at low bitrates. ©2007 IEEE.
An adaptive reference frame re-ordering algorithm for H.264/AVC based multi-view video codec
This paper proposes an adaptive reference frame re-ordering algorithm for H.264/AVC based multi-view video codecs. The algorithm relies on statistical analysis of block matching among reference frames at low bitrates. The coded macroblocks are statistically analysed and the corresponding order of the reference frames is then determined. The adaptive reference frame re-ordering algorithm is evaluated for two scenarios. In the first scenario, the multi-view videos are coded using a prediction structure with a number of reference frames. In the second scenario, a video sequence that contains several scene changes is coded. The proposed algorithm has been tested using two different prediction structures for both scenarios. The measurements were carried out on four standard multi-view datasets in addition to a sequence containing several scene changes. Results show that applying the proposed reference frame re-ordering algorithm saves up to 6.2% of the bitrate when coding a sequence with multiple scene changes and gains up to 0.2 dB when coding a sequence using multiple reference frames at low bitrates. © 2013 EURASIP.
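The statistics-driven re-ordering can be sketched in a few lines: count how often each reference frame is chosen by the coded macroblocks, then sort the reference list by usage. This is a hypothetical Python illustration; the data structures are assumptions, not the codec's internals.

```python
from collections import Counter

def reorder_reference_frames(mb_choices, ref_list):
    """Re-order a reference frame list so the most frequently matched
    frames come first. `mb_choices` is the list of reference indices
    chosen per coded macroblock (the gathered block-matching statistics)."""
    counts = Counter(mb_choices)
    # Sort by descending usage; preserve the original order to break ties.
    return sorted(ref_list, key=lambda r: (-counts[r], ref_list.index(r)))
```

Placing popular references early shortens the entropy-coded reference indices, which is where the bitrate saving comes from.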
H.264/AVC based multi-view video codec using the statistics of block matching
This paper proposes two reference frame architectures for H.264/AVC based multi-view video codecs. To achieve this, the block matching amongst the reference frames of the codec is statistically analyzed. Based on the resulting statistics, two sets of reference frame architectures for the best coding performance of the codec are proposed. The coding performance of the codec using the proposed reference frame architectures is assessed against the same codec using three different reference frame architectures. The measurements were carried out on four standard multi-view datasets. Results show that the application of the proposed reference frame architectures significantly improves the coding performance of the codec (by up to 2.3 dB). © 2013 Croatian Society Electronics in Marine - ELMAR.
Multiresolution, adaptive vector quantization and perceptual based multiview image codec
This paper presents a multiresolution, adaptive vector quantization and perceptual based multiview image coding scheme. It decorrelates the input views into a number of subbands using a lifting based wavelet transform. The coefficients in the same subbands of different views are divided into vectors and then joined together. The resulting vectors are vector quantized using an adaptive vector quantization scheme. Perceptual weights are designed for different viewing distances and used in the vector selection and bit allocation stages of the adaptive vector quantization technique. To evaluate the performance of the proposed codec, two sets of multiview test images were coded using the proposed codec, with and without perceptual weights, and the monoview vector quantization coding algorithm. Results indicated that the proposed codec, with and without perceptual weights, significantly outperforms the basic vector quantization technique. Results also showed that the proposed technique with perceptual weights gave superior objective and subjective image quality compared to the algorithm without perceptual weights.
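The basic vector quantisation step underlying the scheme, mapping each coefficient vector to its nearest codeword, can be sketched in NumPy as follows (unweighted Euclidean distance; the paper's adaptive codebook and perceptual weighting are not shown):

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Map each input vector to the index of its nearest codeword
    (Euclidean distance) - the core step of vector quantisation.
    `vectors` is (N, D), `codebook` is (K, D)."""
    # Pairwise squared distances between vectors and codewords: (N, K)
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Reconstruct each vector as its chosen codeword."""
    return codebook[indices]
```

A perceptually weighted variant would scale each coordinate of the difference by a viewing-distance-dependent weight before summing.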
Application of ZigBee and RFID Technologies in Healthcare in Conjunction with the Internet of Things
The paper outlines the application of ZigBee and Radio Frequency Identification (RFID) technologies using the Internet of Things (IoT) in a healthcare environment. The Internet of Things, RFID, ZigBee and Cloud computing are emerging technologies in computing, and their combination can be used to control complex human interaction and to reduce cost and task times by connecting smart devices to the Internet. The extraordinary growth of these technologies has made it possible to identify, locate and track objects in various environments. These technologies could allow pervasive computing systems to be more efficiently managed in terms of locating and tracking objects in future operations. RFID is a non-contact identification technology that does not require a direct line of sight to the target object. RFID is cheap (under 5p for a passive tag) and reliable, but its coverage zone is limited, especially for a passive tag (under 11 m). In the proposed system, the combination of ZigBee and RFID is designed with two aims: firstly, to provide access control and location positioning; and secondly, to exploit the low power output and larger coverage of ZigBee devices to monitor and confirm the location of an object and continually check the position of target objects. The proposed system combines both emerging technologies, with ZigBee used to detect and locate an object and RFID used to identify the 'gate' or 'door' position of an object at a designated floor level. © 2013 ACM.
Multi-view image coding with wavelet lifting and in-band disparity compensation
This paper presents a novel framework to achieve scalable multi-view image coding. As an open-loop operation, the wavelet lifting scheme for geometric filtering is exploited to overcome the limitation of SNR scalability and to attain view scalability. The key to achieving spatial scalability is in-band prediction, which removes correlations among subbands level by level via shift-invariant references obtained by the Overcomplete Discrete Wavelet Transform (ODWT). Additionally, the proposed disparity compensated view filtering allows different filters and estimation parameters to be used at each resolution level. The experiments show comparable results at full resolution and a significant improvement at coarser resolutions over the conventional spatial prediction scheme.
Fuzzy, Weighted-Offset, Multiscale Edge Detection for automatic echocardiographic LV boundary extraction
Machine Vision and Augmented Intelligence: Select Proceedings of MAI 2022
This book comprises the proceedings of the International Conference on Machine Vision and Augmented Intelligence (MAI 2022). The conference proceedings encapsulate the best deliberations held during the conference.
Lung cancer remains a leading cause of death and disability worldwide. Many metabolic abnormalities and genetic illnesses, including cancer, can be fatal. Histological diagnosis is an important part of determining the form of malignancy. Thus, one of the most significant research challenges is the classification of lung cancer from histopathology images. The proposed method uses ensemble learning for the classification of lung cancer and its subtypes, employing pre-trained deep learning models (EfficientNetB3, InceptionNetV2, ResNet50, and VGG16). The ensemble model is created using a VotingClassifier in soft voting mode and is fit using the extracted features (features_train) and training labels (y_train). Images of lung tissue from the LC25000 database are used to train and evaluate the ensemble classifiers. The proposed method achieves an average F1 score of 99.33%, recall of 99.33%, precision of 99.33%, and accuracy of 99.00% for lung cancer detection. The findings demonstrate that the proposed approach performs noticeably better than existing models. The technique is better suited to a wide range of classification challenges than a single classifier alone and could improve prediction accuracy.
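The soft-voting step can be sketched as follows: each classifier outputs class probabilities, the (optionally weighted) probabilities are summed, and the highest-scoring class wins. This is a minimal NumPy illustration, not the authors' scikit-learn `VotingClassifier` pipeline.

```python
import numpy as np

def soft_vote(prob_matrices, weights=None):
    """Soft-voting ensemble: combine the class-probability outputs of
    several classifiers and pick the highest-scoring class per sample.
    `prob_matrices` is a list of (n_samples, n_classes) arrays."""
    stack = np.stack([np.asarray(p, dtype=float) for p in prob_matrices])
    if weights is not None:
        w = np.asarray(weights, dtype=float)
        stack = stack * w[:, None, None]  # scale each model's probabilities
    return stack.sum(axis=0).argmax(axis=1)
```

With four backbone models, disagreement on borderline tissue patches is resolved by the aggregate probability mass rather than a single model's hard decision.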
The integration of home automation systems has become increasingly prevalent, driven by advancements in sensor technologies and microcontroller capabilities. This paper presents the design, implementation, and evaluation of a comprehensive home automation system utilizing various sensors and control mechanisms. The system includes motion detectors, thermal sensors, presence sensors, light sensors, and devices such as fans and lighting systems. The effectiveness of the system in enhancing energy efficiency, comfort, and convenience is evaluated based on its real-world performance.
Keywords: Home Automation, Sensor Integration, Energy Efficiency, Temperature Control, Lighting Management
The development of efficient video codecs for low bitrate video transmission with higher compression efficiency has been an active area of research in recent years, and various techniques have been proposed to fulfil this need. In this paper, a mixed resolution based video codec for low bitrate transmission within the standard HEVC codec's framework is proposed. A spatial resolution scaling type of mixed resolution coding model for monoscopic videos using the HEVC codec is presented. The proposed mixed-resolution structure and reference frame structure simplify the implementation of a mixed-resolution based HEVC codec that can code video frames of different resolutions. To evaluate the performance of the proposed codec, three 4:2:0 format test video sequences, namely “Cactus”, “KristenAndSara” and “ParkScene”, were selected and coded using the proposed codec and the standard HEVC codec. Experimental results show that the proposed mixed resolution based HEVC codec gives significantly higher coding performance than the standard HEVC codec at low bitrates.
Multiresolution statistical and vector quantization based video codec
This paper presents a novel hybrid multiresolution statistical and vector quantization based video coding scheme. In the intra mode of operation, a wavelet transform is used to decorrelate the input frame into a number of subbands. The high frequency subbands are coded using a novel statistically based coding algorithm. In the inter mode of operation, overlapped block motion estimation/compensation is employed to exploit inter-frame redundancy. A wavelet transform is then applied to the displaced frame difference to decorrelate it into a number of subbands. The coefficients in the resulting subbands are coded using an adaptive vector quantization scheme. To evaluate the performance of the proposed codec, the proposed codec and the adaptive subband vector quantization (ASVQ) coding scheme, which has been shown to outperform H.263 at all bitrates, were applied to a number of test sequences. Results indicate that the proposed codec outperforms ASVQ subjectively and objectively at all bitrates.
Stereo image representation using compressive sensing
This paper presents a compressive sensing based stereo image representation technique using wavelet transform gains. The pair of input stereo images is first decomposed into its low-pass and high-pass views using a motion compensated lifting based wavelet transform. A 2D spatial wavelet transform then further de-correlates the low-pass view into its sub-bands. Wavelet transform gains are employed to regulate the threshold value for the different sub-bands. The coefficients in the high frequency sub-bands and the high-pass view are then hard thresholded to generate their sparse counterparts. The compressive sensing method is then used to generate measurements for the resulting sparse sub-bands and view. The baseband coefficients and measurements are finally losslessly coded. The application of compressive sensing to compressing natural images is in its early stages; hence such codecs are usually compared with each other rather than with standard codecs. The performance of the proposed codec is superior to the state of the art and is subjectively superior to JPEG. © 2011 IEEE.
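The sparsification and measurement steps can be illustrated in NumPy: hard thresholding enforces sparsity, and a random Gaussian matrix produces the compressive measurements y = Φx. This is a textbook sketch under assumed parameters; recovery by an l1 solver such as Basis Pursuit is not shown.

```python
import numpy as np

def hard_threshold(coeffs, thresh):
    """Zero every coefficient whose magnitude is below the threshold,
    enforcing sparsity in the transform domain."""
    out = coeffs.copy()
    out[np.abs(out) < thresh] = 0.0
    return out

def cs_measure(sparse_vec, m, seed=0):
    """Take m random Gaussian measurements y = Phi @ x of a sparse
    signal; returns the measurements and the measurement matrix."""
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal((m, sparse_vec.size)) / np.sqrt(m)
    return phi @ sparse_vec, phi
```

In the codec described above, the threshold per sub-band would be scaled by the wavelet transform gain rather than fixed.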
Progressive DCT Based Image Codec Using Statistical Parameters
This paper presents a novel progressive statistical and discrete cosine transform based image-coding scheme. The proposed coding scheme divides the input image into a number of non-overlapping pixel blocks. The coefficients in each block are then decorrelated into their spatial frequencies using a discrete cosine transform. Coefficients with the same spatial frequency at different blocks are put together to generate a number of matrices, where each matrix contains coefficients of a particular spatial frequency. The matrix containing DC coefficients is losslessly coded to preserve visually important information. Matrices, which consist of high frequency coefficients, are coded using a novel statistical encoder developed in this paper. Perceptual weights are used to regulate the threshold value required in the coding process of the high frequency matrices. The coded matrices generate a number of bitstreams, which are used for progressive image transmission. The proposed coding scheme, JPEG and JPEG2000 were applied to a number of test images. Results show that the proposed coding scheme outperforms JPEG and JPEG2000 subjectively and objectively at low compression ratios. Results also indicate that the decoded images using the proposed codec have superior subjective quality at high compression ratios compared to that of JPEG, while offering comparable results to that of JPEG2000.
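The regrouping step, collecting same-frequency coefficients from every block into one matrix per spatial frequency, can be expressed as a single reshape/transpose in NumPy. The DCT itself is omitted here; the sketch works for any block transform.

```python
import numpy as np

def regroup_by_frequency(image, b=8):
    """Split an image of block-transform coefficients into b x b blocks
    and gather the coefficients sharing each spatial-frequency position
    into one (H/b) x (W/b) matrix. result[u, v] holds the (u, v)
    coefficient of every block."""
    h, w = image.shape
    # [i, u, j, v] indexes pixel (i*b + u, j*b + v)
    blocks = image.reshape(h // b, b, w // b, b)
    return blocks.transpose(1, 3, 0, 2)
```

The (0, 0) matrix then contains the DC coefficients (losslessly coded in the scheme above), and the remaining matrices hold the high frequency coefficients for the statistical encoder.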
Image resolution enhancement using multi-wavelet and cycle-spinning
In this paper a multi-wavelet and cycle-spinning based image resolution enhancement technique is presented. The proposed technique generates a high-resolution image from the input low-resolution image using an inverse multi-wavelet transform with all multi-wavelet high frequency subband coefficients set to zero. The cycle spinning algorithm, in conjunction with the multi-wavelet transform, is then used to generate a high quality super-resolution image from the resulting high-resolution image, as follows: a number of replicated images with different spatial shifts of the resulting high-resolution image is first generated; each replicated image is de-correlated into its subbands using a multi-wavelet transform; the multi-wavelet high frequency subband coefficients of each de-correlated image are set to zero and a primary super-resolution image is produced for each using an inverse multi-wavelet transform; the resulting primary super-resolution images are then spatially shift compensated, and the output super-resolution image is created by averaging the shift-compensated images. Experimental results were generated using four standard test images and compared to state-of-the-art techniques. Results show that the proposed technique significantly outperforms classical and non-classical super-resolution methods both subjectively and objectively. © 2012 IEEE.
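The cycle-spinning loop (shift, process, un-shift, average) can be sketched generically in NumPy; here `operator` is a placeholder for the paper's zero-high-band inverse multi-wavelet reconstruction.

```python
import numpy as np

def cycle_spin(image, operator, shifts):
    """Cycle spinning: apply a shift-sensitive operator at several
    spatial shifts, undo each shift, and average the results to
    suppress shift-dependent artefacts."""
    acc = np.zeros(image.shape, dtype=float)
    for dy, dx in shifts:
        shifted = np.roll(image, (dy, dx), axis=(0, 1))
        processed = operator(shifted)
        acc += np.roll(processed, (-dy, -dx), axis=(0, 1))
    return acc / len(shifts)
```

Because the multi-wavelet transform is not shift-invariant, each shift produces a slightly different reconstruction, and averaging them reduces ringing along edges.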
Multi-scale, Perceptual and Vector Quantization Based Video Codec
This paper presents a novel hybrid multi-scale, perceptual and vector quantization based video coding scheme. In the intra mode of operation, a wavelet transform is applied to the input frame, decorrelating it into a number of subbands. The lowest-frequency subband is losslessly coded. The coefficients of the high-frequency subbands are pixel quantized using perceptual weights, which are specifically designed for each high-frequency subband. The quantized coefficients are then coded using a quadtree coding scheme. In the inter mode of operation, a displaced frame difference is generated using overlapped block motion estimation/compensation to exploit inter-frame redundancy. A wavelet transform is then applied to the displaced frame difference to decorrelate it into a number of subbands. The coefficients in the resulting subbands are coded using an adaptive vector quantization scheme. To evaluate the performance of the proposed codec, the proposed codec and the adaptive subband vector quantization (ASVQ) coding scheme, which has been shown to outperform H.263 at all bit rates, were applied to a number of test sequences. Results indicate that the proposed codec outperforms ASVQ subjectively and objectively at all bit rates. © 2007 IEEE.
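The perceptually weighted quantization step described above can be illustrated with a minimal sketch. The uniform scalar quantizer, the base step size, and the weight value are assumptions for illustration; the paper designs a specific weight per subband:

```python
import numpy as np

def quantize_subband(coeffs, base_step, weight):
    # Uniform scalar quantization; the perceptual weight scales the
    # step size per subband (a larger weight gives coarser quantization
    # where the eye is less sensitive).
    step = base_step * weight
    return np.round(coeffs / step).astype(int), step

def dequantize(indices, step):
    # Decoder-side reconstruction to the quantization grid
    return indices * step

band = np.array([[3.2, -7.9], [0.4, 12.1]])           # hypothetical subband
idx, step = quantize_subband(band, base_step=2.0, weight=1.5)  # step = 3.0
rec = dequantize(idx, step)
```

The quantized index maps (here `idx`) are what a quadtree coder would then exploit, since perceptual quantization drives many high-frequency coefficients to zero in large contiguous regions.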
Disparity compensated view filtering wavelet and compressive sampling based multi-view image codec
This paper presents a multi-view image codec using disparity compensated lifting based wavelet transforms and Compressive Sampling (CS). Disparity compensated view filtering lifting based wavelet transforms are applied to the input multi-view images, decomposing them into their view sub-bands. The dense view is further decomposed into its spatial sub-bands using a wavelet transform. High-frequency coefficients are hard-thresholded to improve and also to control their sparsity. For the high-frequency sub-bands/views, wavelet weights are calculated and used to regulate the threshold values for those sub-bands/views. The CS algorithm is then used to generate measurements for each resulting sparse sub-band. At the decoder, the Basis Pursuit method is used to recover the dominant coefficients. An assessment of the energy of the non-dominant coefficients at different compression ratios, and of their effect on the quality of the reconstructed images, is given. Results show that the proposed codec outperforms state-of-the-art codecs. © 2013 Croatian Society Electronics in Marine - ELMAR.
Wavelet-based video codec using human visual system coefficients for 3G mobiles
A new wavelet-based video codec that uses human visual system coefficients is presented. In the INTRA mode of operation, a wavelet transform is used to split the input frame into a number of subbands. Human visual system coefficients designed for handheld videophone devices are used to regulate the quantization step size in the pixel quantization of the high-frequency subbands' coefficients. The quantized coefficients are coded using a quadtree coding scheme. In the INTER mode of operation, the displaced frame difference is generated and a wavelet transform decorrelates it into a number of subbands. These subbands are coded using an adaptive vector quantization scheme. Results indicate a significant improvement in frame quality compared to Motion JPEG2000.
A Novel H.264/AVC Based Multi-View Video Coding Scheme
This paper investigates extensions of H.264/AVC for compressing multi-view video sequences. The proposed technique re-sorts the frames of sequences captured by multiple cameras looking at a person in a scene from different views, generating a single video sequence. The multi-frame referencing property of H.264/AVC, which enables exploitation of the spatial and temporal redundancy contained in multi-view sequences, is employed to implement several modes of operation in the proposed coding algorithm. To evaluate the performance of the proposed coding technique in its different modes of operation, five multi-view video sequences at different frame rates were coded using the proposed and the simulcast H.264/AVC coding schemes. Experiments show the superior performance of the proposed coding scheme when coding the multi-view sequences at frame rates from low up to half of the original. © 2007 IEEE.
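The frame re-sorting step might look like the sketch below. The view-major interleaving per time instant is an assumption made for illustration; the paper defines several modes of operation and does not fix a single ordering:

```python
# Hypothetical helper: interleave frames from V equal-length views into
# one sequence, view-major per time instant, so that a single-sequence
# encoder's multi-frame referencing can reach both the previous frame of
# the same view (temporal redundancy) and neighbouring views (spatial
# redundancy).
def resort_multiview(views):
    # views: list of V lists of frames, one list per camera
    return [frame for instant in zip(*views) for frame in instant]

seq = resort_multiview([["v0t0", "v0t1"], ["v1t0", "v1t1"]])
# -> ["v0t0", "v1t0", "v0t1", "v1t1"]
```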
Multi-resolution, perceptual and compressive sampling based image codec
Direct application of compressive sampling to coding the wavelet high-frequency coefficients of an image significantly deteriorates the quality of the reconstructed image. This is due to the error introduced by the many high-frequency coefficients that have small but non-zero values. In this paper, a novel multi-resolution image coding scheme using compressive sampling and perceptual weights is presented that significantly improves the quality of the reconstructed images by setting the coefficients with small values to zero using two different hard-thresholding operators. The proposed codec applies a wavelet transform to the input image, decorrelating it into its frequency subbands. Baseband coefficients are losslessly coded to preserve their visually important information. The high-frequency subbands' coefficients are hard-thresholded to improve and also to control their sparsity. Perceptual weights for the different wavelet subbands are calculated and used to adjust the threshold values for those subbands. A compressive sampling algorithm is used to generate measurements for each resulting sparse subband. The measurements for each subband are then cast to integers and arithmetic coded. At the decoder, the Basis Pursuit method is used to recover the coefficients. Empirical values of the observation factor giving the best coding performance were first determined using standard test images. The performance of the codec was then assessed using standard test images. Results show that the application of perceptual weights in regulating the threshold values significantly improves the coding performance of the codec.
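The encoder-side pipeline above (weighted hard thresholding followed by random measurements) can be sketched in numpy. The Gaussian measurement matrix, the threshold values, and the function names are assumptions for illustration; the decoder-side Basis Pursuit recovery needs a convex solver and is omitted:

```python
import numpy as np

def cs_encode_subband(coeffs, weight, base_threshold, m, rng):
    # Hard-threshold with a perceptually weighted threshold to enforce
    # sparsity, then take m random Gaussian measurements y = Phi @ x.
    x = coeffs.ravel().copy()
    x[np.abs(x) < base_threshold * weight] = 0.0   # hard thresholding
    phi = rng.standard_normal((m, x.size)) / np.sqrt(m)
    return phi @ x, phi, x   # decoder would recover x from (y, phi)

sub = np.array([[5.0, 0.1], [-0.2, 8.0]])          # hypothetical subband
y, phi, x = cs_encode_subband(sub, weight=1.0, base_threshold=0.5, m=3,
                              rng=np.random.default_rng(0))
```

Zeroing the small coefficients is the point of the scheme: it makes `x` genuinely sparse, so the sparse-recovery guarantee behind Basis Pursuit applies rather than being corrupted by many small non-zero values.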
Fish Quality Assessment Using Hyperspectral Imaging and Computer Vision: A Review
The global food industry prioritizes the quality and safety of fish and seafood products due to their perishable nature and the increasing consumer demand for nutritious, high-quality protein sources. Traditional quality assessment methods—such as sensory evaluation, chemical analysis, physical testing, and microbiological testing—form the foundation of current practices but face notable limitations, including subjectivity, destructiveness, and labor-intensive procedures that delay results. Motivated by the need for faster, more reliable, and nondestructive quality control systems, this article investigates how emerging technologies can address these limitations. Specifically, it aims to answer the following review questions: 1) what are the limitations of traditional fish quality assessment methods? 2) how can hyperspectral imaging (HSI) and computer vision improve the accuracy and efficiency of quality assessment? and 3) what roles do machine learning (ML) and deep learning (DL) techniques play in enhancing these technologies? This article explores the integration of HSI and computer vision as cutting-edge, noninvasive technologies enabling real-time, comprehensive evaluation of key fish quality attributes such as freshness, safety, nutritional content, and species verification. The fusion of HSI and computer vision with advanced learning algorithms enhances precision in quality control, reduces food waste, and supports compliance with modern standards. Finally, this article underscores the need for continued research to drive sustainable innovation and strengthen consumer confidence.
This paper presents a mixed-resolution stereo video coding model for the High Efficiency Video Codec (HEVC). The challenging aspects of mixed-resolution video coding are enabling the codec to encode frames of different resolutions/sizes and to use decoded pictures of different resolutions/sizes for referencing. These challenges are compounded when implemented in HEVC, since the incoming video frames are subdivided into coding tree units. The ingenuity of the proposed codec's design is that the information in intermediate frames is down-sampled, yet the frames retain their original resolution. To enable random access to a full-resolution decoded frame in the decoded picture buffer as a reference frame, a down-sampled version of the decoded full-resolution frame is used. The test video sequences were coded using the proposed codec and standard MV-HEVC. Results show that the proposed codec gives a significantly higher coding performance than the MV-HEVC codec.
Activities (2)
From Pixels to Proof: Forensic Techniques for Source Camera Identification
Detecting Subcutaneous Veins Using Hyperspectral Imaging
Current teaching
- L7 Vision and Image Systems (module leader)
- L7 MSc Dissertation (module leader)
- L7 MSc Dissertation supervision
- L7 Research Practice
- L7 Engineering System Control
- L6 BSc and MEng Production Project supervision
- L6 Digital Signal Processing
- L5 Analogue Electronics (module leader)
- L5 Embedded Systems
- L4 Electrical and Electronics Principles 1 (module leader)
- L4 Engineering System and Data Acquisition (module leader)
- L4 Computer System Architecture
Grants (4)
Research, develop and implement a scalable and modular system which monitors and analyses individual behavioural patterns and movements in a range of environments
Leeds Beckett University and Riverside Greetings mKTP
Leeds Beckett University and Fathers Farm Foods Ltd
Developing hyper-spectral imaging for aflatoxin screening in pistachios
News & Blog Posts
New Leeds Beckett University partnership to simplify solar energy adoption for businesses
- 23 Feb 2026
Leeds Beckett University supporting Wakefield greetings card company's growth with innovative use of RFID technology
- 26 Sep 2024
Leeds Beckett to increase food safety using innovative new technology
- 26 Mar 2024
New RFID and AI technology set to revolutionise performance of West Yorkshire greetings card company
- 13 Nov 2023