Question

In: Computer Science

(Java)Run length encoding is a simple form of data compression. It replaces long sequences of a...

(Java)Run length encoding is a simple form of data compression. It replaces long sequences of a repeated value with one occurrence of the value and a count of how many times to repeat it. This works reasonably well when there are lots of long repeats such as in black and white images. To avoid having to represent non-repeated runs with a count of 1 and the value, a special value is often used to indicate a run and everything else is just treated as a simple value. For this exercise you will decompress one line of input at at time assuming the line was compressed according to the following:

The newline characters are passed through uncompressed even when there are repeated blank lines. When a run of n>=3 repeated characters, c, is detected where c is NOT a digit, the run will be replaced with #nc. When a run of n>=3 repeated characters, c, is detected where c IS a digit, the run will be replaced with #n#c. The extra # is needed to avoid confusing the repeated digit c with the last digit of the run count. All other characters (i.e. runs of just 1 or 2 characters) are passed through unmodified. Assume the uncompressed input does not contain the symbol '#'. This assumption can be eliminated. You might think about how you would do it.

Some examples:

abc decompresses to abc

#3ab#4c decompresses to aaabcccc

abc12#14#3 decompresses to abc1233333333333333

Your decoder can assume the input was properly compressed, i.e. no need for error checking. Your program must include a public static method decompress() that takes one parameter, a String, that is the line to be decompressed. The method returns the decompressed String. Write Decoder.java to answer this question.

Solutions

Expert Solution

The length encoding is a simple form of data compression. It replaces long sequences of a repeated value with one occurrence of the value and a count of how many times to repeat it. This works reasonably well when there are lots of long repeats such as in black and white images. To avoid having to represent non-repeated runs with a count of 1 and the value, a special value is often used to indicate a run and everything else is just treated as a simple value. For this exercise you will decompress one line of input at at time assuming the line was compressed according to the blow giving code and the data.

package com;

import java.util.HashMap;

public class Encoder {

public static void main(String[] args) {

String s = "aabc1233333333333333";
s = "abc";
/*s = ""
+ " "
+ " "
+ "";*/
//s= "aaabcccc";
//s = "aaa";
s = "aa";
System.out.print(compress(s));
//System.out.print(countCharacters(s));

}

/**
* This method will accept the String and count the each character and put it in the map as key value pair.
* Key : Character
* Value : Count
*
* Count is start from 0 like an array. If count = 0 then it means it has 1 value.
* example -
* input : "aabc1233333333333333"
* Output : {a=1, 1=0, b=0, 2=0, c=0, 3=13}
*
* @param input - It takes string as input.
* @return HashMap contain count of each character.
*/
public static HashMap<Character, Integer> countCharacters(String input ) {

HashMap<Character, Integer> map = new HashMap<Character, Integer>(); // Creating a map for counting character
int len = input.length();
char[] characters = input.toCharArray();

/** Here we creating a character array of given String and iterating each character.
If map contains that character then we increase it's value.
If map not contains that character then we put 0
*/
for (int i = 0; i < len; i++) {
if (map.containsKey(characters[i])) {
Integer count = map.get(characters[i]);

map.put(characters[i], ++count);
} else {
map.put(characters[i], 0);
}
}
return map;

}

/**
* This method take input as String. Process that String as given condition and return as encoded String.
* For detail understanding see each lines comments.
* Example
* Input : abc
* Output : abc
*
* Input : aaabcccc
* Output : #3ab#4c
*
* @param input - It takes string as input.
* @return encoded String
*/
public static String compress(String input) {

int len = input.length();
/**
* If the given String length is less than 3 then we retrun same String.
* Example :
* Input : ab
* Output : ab
*/
if(len < 3 ){
return input;
}

/*Call the method that will return map who contains count of each character*/
HashMap<Character, Integer> map = countCharacters(input);

StringBuffer finalString = new StringBuffer();
char[] characters = input.toCharArray();

/**
* For Each loop we check if the given character is present in the map or not.
* If char is Present in a Map -
* we take count from Map.
* If count is gretter than or equal to 2 .
* then we check it is digit or character for differentiating for '#'
* based on we append that character to finalString.
* If Count is less than 2
* then we direct append that character to finalString.
*
*
* And Finally we remove this character from map. For remove repetition. The If case used for not to come exception ArrayIndexOutofBound.
* Means i is always == Or less than then then only we can remove from map.
* If char is not Present in a Map -
* This case will never occur
*
*
*
*/
for (int i = 0; i < len; i++) {

if (map.containsKey(characters[i])) {

int count = map.get(characters[i]);

if (count != 0 && count >= 2) {

if (Character.isDigit(characters[i])) {
finalString.append("#" + ++count + "#" + characters[i]);

} else {
finalString.append("#" + ++count + characters[i]);

}

} else {
finalString.append(characters[i]);
}

if ((i + 1) <= len && (i + 2) <= len && characters[i] == characters[i + 1]
&& characters[i] == characters[i + 2]) { /*The If case used for not to come exception ArrayIndexOutofBound. Means i is always == Or less than*/
map.remove(characters[i]);
}
}
}

return finalString.toString();

}
}


Related Solutions

Run Length Encoding It is sometimes important to minimise the space used for storing data. This...
Run Length Encoding It is sometimes important to minimise the space used for storing data. This idea of data compression can be implemented in many forms. One of the simpler techniques is known as Run Length Encoding (RLE). RLE takes advantage of the fact that in many cases, data will contain a run of repeated 0s or 1s, and these runs can be represented by a shorter code. RLE is a technique used in some image or sound formats to...
C Programming Run Length Encoder (Compression). I am writing a program that takes an image (from...
C Programming Run Length Encoder (Compression). I am writing a program that takes an image (from a text file) and looks for patterns of 2 or more duplicate values. The program should replace these values with the following pattern:  2 such characters followed by an Integer ( which represents number of occurrences of each character), followed by an asterisk. For example say the input of the non-compressed image was: ,,,,)H 7. i.e. it consists of four commas, one closed bracket, the...
1) how do you decide how long to run a simulation for? (Run length) 2) any...
1) how do you decide how long to run a simulation for? (Run length) 2) any two methods for getting a sample for a non terminating solution 3) for a system where no data is available, how do you decide which probability distribution to use? And what do you do if none of the distributions work for you? 4) Why is average utilization a time weighted average? When do we use simple average ?
Question 1. In this exercise you will simulate a data set and run a simple regression....
Question 1. In this exercise you will simulate a data set and run a simple regression. To ensure reproducible results, make sure you use set.seed(1). a)   Using the rnorm() function create vector X that contains 200 observations from N(0,1) distribution. Similarly, create a 200 element vector, epsilon (ϵ), drawn from N(0,0.25) distribution. This is the irreducible error. b)   Create the response data using the following relationship: Y=−1+0.5X+ϵ Fit a linear regression of Y on X. Display the summary statistics and...
The following data depict the length of delay in discharging 18 older patients to a long-term...
The following data depict the length of delay in discharging 18 older patients to a long-term care facility: 5,3,4,1,2,6,7,4,8,10,3,5,3,5, 4,6,7,9 Use Excel to find the mean, the mode, the median, the standard deviation, the range, the coefficient of skewness, the 3rd quartile, the 6th decile, and the 25th percentile.
the data is as follows: expected long-run growth rate: 5%, cost of equity is 10%, forecast...
the data is as follows: expected long-run growth rate: 5%, cost of equity is 10%, forecast horizon is 3 years, discounted (t=0) Terminal value = $500. Free cash flows to equity in year 1 is 100, year 2 is 150 and year 3 is 120. What is the value of the firm at today, i.e., t = 0?
Consider this table containing long run production data: Units of Capital, K Units of Labor, L...
Consider this table containing long run production data: Units of Capital, K Units of Labor, L 1 2 3 4 1 100.00 131.95 155.18 174.11 2 141.42 186.61 219.46 246.23 3 173.21 228.55 268.79 301.57 4 200.00 263.90 310.37 348.22             Does this production function exhibit a constant return to scale? Explain your answer.             How many units of output are produced with 4 units of labor and 2 units of capital?              Suppose the firm is using 3 units...
Sketch a graph of f(x)= -2(x-1)^2(x+1), based on all the data about zeros, long run behavior,...
Sketch a graph of f(x)= -2(x-1)^2(x+1), based on all the data about zeros, long run behavior, etc.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT