Question
Project : Anomaly detection in Big Social Data
This project aims to research novel approaches to anomaly detection in big social data. The complexity, dynamicity, heterogeneity and scale of big social data call for new, advanced analytical anomaly detection techniques. The technologies generated from this research can be applied to opinion spam detection and fake news detection in social media, as well as deception identification in other cyber security applications. The social nature of Web 2.0 has led to the unprecedented growth of social media sites such as discussion forums, product review sites, microblogging, social networking and social curation. Social media is increasingly becoming an important source and medium for information, communication and socialising. Unfortunately, with opportunity comes malicious activity. Spam and fraud on social media have become a threat to its usefulness and benefits. In recent years, online opinions and reviews have become an important factor in determining purchases. Product reviews are often the first point of reference for people seeking information about a purchase. Positive reviews and high star ratings generate huge financial benefit, whereas negative reviews and low star ratings lead to financial loss. Because of this financial incentive, opinion spam, where fake reviews and ratings are written, is widespread on product review sites. In addition, people increasingly obtain their news online or via social networks such as Facebook. Hence the release of fake or false news can bring about huge financial or political benefits, e.g., by manipulating the stock market or influencing elections. For both of these problems, labelled data is scarce and costly to obtain, so supervised and signature-based techniques are not suitable for detecting such social media fraud. Unsupervised anomaly detection approaches have therefore been proposed to detect social media fraud [1][2].
Anomaly detection is a powerful unsupervised approach to detecting social media fraud [2]. However, traditional anomaly detection has focused on analyzing structured, simple social media objects and is unable to adequately detect complicated online fraud behaviour, which requires the combined analysis of complex, dynamic, heterogeneous social media data sources and their networks of interactions. Our initial research has shown that rating deviation carries strong signals for opinion spam detection [2] and that interaction anomalies in social networks can be detected [3]. In this project, we will research and develop a holistic approach to anomaly detection that considers unstructured social media content (text), structured metainformation (e.g., geo tags) and the links among objects in the social network. Recent research on learning distributed representations for unstructured texts and networks, in combination with deep learning, opens a new suite of approaches to anomaly detection for complex data types [4][5]. We will develop approaches combining machine and deep learning, text mining, natural language processing and social network analysis. This project will build on our existing work in the area [1][2][3].
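To make the rating deviation signal mentioned above concrete, here is a minimal sketch in Python. It is an illustration only, not the method of [2]: the `(reviewer, product, stars)` tuple layout, the function names and the sample reviews are all assumptions made for this example. It scores each reviewer by how far, on average, their star ratings sit from the mean rating of the products they reviewed.

```python
from collections import defaultdict
from statistics import mean


def rating_deviations(reviews):
    """Return (reviewer, product, |stars - product mean|) for every review.

    `reviews` is a list of (reviewer, product, stars) tuples; this layout is
    an assumption of the sketch. The product mean includes the reviewer's
    own rating, which is a simplification.
    """
    by_product = defaultdict(list)
    for _, product, stars in reviews:
        by_product[product].append(stars)
    product_mean = {p: mean(r) for p, r in by_product.items()}
    return [
        (reviewer, product, abs(stars - product_mean[product]))
        for reviewer, product, stars in reviews
    ]


def mean_deviation_per_reviewer(reviews):
    """Average each reviewer's deviations into one unsupervised anomaly score."""
    per_reviewer = defaultdict(list)
    for reviewer, _, deviation in rating_deviations(reviews):
        per_reviewer[reviewer].append(deviation)
    return {r: mean(d) for r, d in per_reviewer.items()}


# Hypothetical toy data: "mallory" rates against the consensus on both products.
reviews = [
    ("alice", "kettle", 4), ("bob", "kettle", 5),
    ("carol", "kettle", 4), ("mallory", "kettle", 1),
    ("alice", "lamp", 2), ("bob", "lamp", 2),
    ("carol", "lamp", 1), ("mallory", "lamp", 5),
]
scores = mean_deviation_per_reviewer(reviews)
suspect = max(scores, key=scores.get)  # reviewer with the largest mean deviation
```

No labels are needed to compute the score, which is why a signal of this kind suits the unsupervised setting described above. A high score alone does not prove fraud; it only marks a reviewer for closer inspection.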
Project : Big Data Analysis for Missing Environment Data Estimation
While air quality at the macro scale is generally good in most Australian cities (e.g., Melbourne), air pollution at the micro scale needs special attention, in particular (1) the regions around construction sites and along main roads, (2) indoor air quality, and (3) pollen-affected regions. Since a large population is exposed to micro-scale air pollution, it is critical to have a detailed air pollution map of the city, which will provide fundamental information for policy. However, micro-scale air quality monitoring is largely ignored in the research community, and the current state of the art fails to address this problem because it is impossible to install air quality sensors in every corner of the city.
Through innovative big data analysis techniques, this project aims to track and estimate air pollution via two streams. One is to predict air pollution by learning from historical data using various time series models [1]. The other is to estimate the air pollution of regions that are not covered by an air monitoring station. The idea is to learn from other regions that have similar context features, such as a similar number of vehicles, and from the same region's history, such as readings taken in the same weather conditions [2]. Other research problems include the effect of air pollution exposure on the health of residents [3], the research methodology [4] and analysis tools [5]. The technical challenges of the proposed problems focus on the following aspects:
- The air pollution monitoring data collected with portable detectors are typically sparse given the large area of the city of Melbourne. To deliver a high-quality micro-scale air pollution map of the city, the pollution model has to consider the spatial and temporal dimensions of the data collection, capture the local features of regions with air pollution problems, and account for the relationship between the density of the monitoring data and the prediction accuracy.
- Respiratory disease may be affected by various factors. We need to investigate techniques for obtaining statistically reliable associations with the air-polluted regions.
- To enable near-real-time micro-scale air pollution monitoring based on portable air pollution detectors, we need techniques to model the historical data and incorporate recently collected data efficiently.
- It is essential, and challenging, to link indoor and outdoor air pollution modelling into a comprehensive micro-scale air pollution model.
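The second stream described above, estimating pollution in a region without a monitoring station from regions with similar context features, can be sketched as a similarity-weighted average. This is only one simple possibility, not the approach of [2]. The feature vectors (assumed to be normalised to comparable scales), the readings and the function name are hypothetical.

```python
import math


def estimate_unmonitored(target_features, monitored, eps=1e-9):
    """Estimate a reading for a region that has no monitoring station.

    `monitored` is a list of (features, reading) pairs from monitored regions.
    Each region is weighted by the inverse of its Euclidean distance to the
    target in feature space, so regions with similar context count more.
    `eps` avoids division by zero when the features match exactly.
    """
    weights = []
    readings = []
    for features, reading in monitored:
        distance = math.dist(target_features, features)
        weights.append(1.0 / (distance + eps))
        readings.append(reading)
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, readings)) / total


# Hypothetical normalised context features: (traffic density, vegetation cover).
monitored = [
    ((0.9, 0.2), 40.0),  # busy road, little vegetation, high reading
    ((0.1, 0.8), 10.0),  # quiet, leafy suburb, low reading
]
halfway = estimate_unmonitored((0.5, 0.5), monitored)
near_road = estimate_unmonitored((0.9, 0.2), monitored)
```

A target that is equally similar to both monitored regions receives an estimate halfway between their readings, while a target whose features match the busy road receives essentially the road's reading. A realistic model would also need the temporal dimension and the data-density effects listed in the challenges above.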
The supervision team of this project has a solid background in mobile computing and urban computing. The project will exploit the data collected by portable air pollution detectors to address this important but largely ignored research problem. By working on this project, the PhD student will have the opportunity to develop advanced skills and experience in big data analysis.
Project : Homomorphic Data Watermarking for Big IoT Data Security & Privacy
The Internet of Things (IoT) represents a technology revolution transforming the current environment into a ubiquitous world, whereby everything that benefits from being connected will be connected [1]. Despite the benefits, maintaining the privacy of the data within these 'things' becomes a great concern and therefore it is imperative to apply privacy preservation techniques to IoT data collection. One such technique is called data watermarking in which data is subtly modified, while preserving the data utility.
Current data watermarking techniques [2], however, focus on the privacy of published datasets shared with untrusted parties. The high connectivity and distributed nature of IoT open up the possibility of privacy compromise before marking can take effect, and therefore privacy enforcement should be deployed at earlier stages (for example, at the sensor).
Homomorphic data marking refers to applying the principles of homomorphism to embedded host data carrying the watermark. This is quite different from homomorphic encryption, and the two should not be confused. When applied to digital media (host media that are, by nature, very error-tolerant), it makes the watermark robust to common signal processing algorithms. When applied to computer-readable data streams from IoT devices or in big data environments, the properties are very different. In this project, we seek to explore those properties and discover how embedded marks can survive even drastic data operations such as summarization, optimization, categorization and subset selection, all operations at which traditional marking systems would fail.
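As a rough illustration of the homomorphism idea, and not the scheme this project proposes, consider an additive mark and a linear summarization such as block averaging. Because averaging is linear, summarizing the marked stream gives the same result as summarizing the original stream and the mark separately and then adding them, so the mark survives in a predictable form. All names, the key and the sensor values below are hypothetical.

```python
import random


def keyed_mark(key, n, strength=0.5):
    """Generate a pseudo-random +/-strength mark sequence from a secret key."""
    rng = random.Random(key)
    return [strength * rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(values, mark):
    """Additive embedding: each value is shifted by the matching mark sample."""
    return [v + m for v, m in zip(values, mark)]


def block_means(values, size):
    """Summarize a stream by averaging consecutive blocks of `size` samples."""
    blocks = [values[i:i + size] for i in range(0, len(values), size)]
    return [sum(block) / len(block) for block in blocks]


# Hypothetical sensor stream and key.
readings = [20.1, 20.4, 19.8, 21.0, 22.3, 21.9, 20.7, 20.2]
mark = keyed_mark(key=1234, n=len(readings))

# Path A: mark the raw stream, then summarize it.
summary_of_marked = block_means(embed(readings, mark), size=4)

# Path B: summarize stream and mark separately, then embed in the summary domain.
marked_summary = embed(block_means(readings, size=4), block_means(mark, size=4))

# The two paths agree up to floating-point rounding.
paths_agree = all(
    abs(a - b) < 1e-9 for a, b in zip(summary_of_marked, marked_summary)
)
```

The agreement holds only because block averaging is linear. Non-linear operations such as categorization or subset selection do not commute with additive embedding in this simple way, which is part of what makes the research question open.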
Solution
Introduction about project
Advancement is integral to the survival of mankind, and the Internet of Things (IoT) offers a vision of transforming traditional ways of operating in many fields. In an age of artificial intelligence (AI) and machines with varied kinds of sensors, we are provided not only with a plethora of information in the form of big data, but also with the responsibility of understanding and managing this data (Ragett, 2015). It is vital that this information benefits individuals by helping them develop an understanding of the data. This in turn puts pressure on stakeholders to analyze large volumes of incoming streaming data and sensor values, which vary with the state of the machines at any given time. The challenge of comprehending this information grows as the technology proliferates and as devices differ in their mechanical and sensory life cycles (Mohammadi, 2010).
Studies suggest that the IoT offers an opportunity to turn every tangible entity into a node on the Internet, allowing stakeholders to interoperate and reuse information across different channels (Chen and Lin, 2014). Advances in the management of IoT services, such as network compression, accelerators (for example, the Eyeriss accelerator), pruning of redundant connections and approximation of the bit nodes, have contributed towards reducing power and computational loads for stakeholders (Mahdavinejad, 2018). However, although many of these service technologies have advantages, they are limited in their interoperability in real-time settings and across diverse IoT services (Ragett, 2015). Furthermore, with measures such as network compression, challenges arise from the alternating pattern of the data stream (Mahdavinejad, 2018). Similarly, when node size is approximated, challenges with accuracy and precision are noted (Chen and Lin, 2014).
Since understanding information gathered through IoT offers the opportunity of building platforms that use predictive analysis and of sustaining platforms for knowledge, the relevance of deep learning technology is recognized (Schmidhuber, 2015). Deep learning transforms data into abstract representations through multiple layers of nonlinear processing units, which can be interpreted through the universal approximation theorem or probabilistic inference (Deng and Yu, 2014). Although deep learning is well recognized for its problem-solving abilities in computer and language processing, it is not deemed well suited to addressing the challenges of IoT. Most of these deep learning service technologies run on resource-constrained platforms; however, they use convolutional neural networks and thus consume high CPU power and are time consuming.